Pass your transcription through an AI model to clean it up, reformat it, or turn spoken commands into written responses.
What Is AI Enhancement?
AI Enhancement is an optional post-processing step that runs your Whisper transcript through a Large Language Model (LLM). Whisper turns your voice into raw text. AI Enhancement then reshapes that text — fixing filler words, adding punctuation, reformatting as a list, or even responding to a spoken instruction — based on the prompt you select.
Think of it as a two-stage pipeline:
Voice → Whisper transcription → AI Enhancement (optional) → Final text
Enhancement is disabled by default. When it's off, you get Whisper's raw output. When it's on, your raw transcript is sent to whichever AI provider you've configured, processed with the active prompt, and the result is what gets typed into your app.
Enabling AI Enhancement
Open the AI Enhancement panel Click the sparkle icon in the mini recorder during dictation, or open EmberType and navigate to the AI Enhancement section.
Toggle "Enable Enhancement" on Once enabled, the prompt grid and provider options become active.
Configure an AI provider Enter an API key for at least one provider (see below) — or set up a local model through Ollama for fully offline enhancement.
Pick a mode Select Default, Assistant, or a custom prompt. The selected prompt becomes the active mode until you change it.
Built-In Modes: Default vs. Assistant
EmberType ships with two predefined modes. They behave very differently, and picking the right one matters.
Default
Clean up what I said
A transcription enhancer. It polishes your raw transcript — fixes filler words, run-ons, punctuation, and speech recognition errors — while preserving your meaning and tone. It never responds to what you said, even if you spoke a question out loud.
This is the right mode for dictating emails, messages, documents, notes, or anywhere you just want a cleaner version of your own words.
You say"um so I was thinking like maybe we should uh push the meeting to Thursday because you know the client's gonna be out"You get"I was thinking maybe we should push the meeting to Thursday because the client is going to be out."
Assistant
Do what I said
A true AI assistant. It treats what you spoke as a request and generates a response. Ask a question, get an answer. Ask for a draft, get a draft. Ask for a summary, get a summary. The LLM responds directly with no preamble or sign-off.
This is the right mode for on-the-fly AI tasks driven by voice — drafting, summarizing, translating, explaining, rewriting.
You say"Write a two-sentence reply to my boss letting her know I'll be out tomorrow for a doctor's appointment."You get"Hi, I wanted to let you know I'll be out tomorrow for a doctor's appointment. I'll follow up on any urgent items when I'm back."
Quick comparison
Behavior
Default
Assistant
Cleans filler words
Yes
Not applicable — it generates fresh text
Answers questions you ask
No (treats them as text to clean)
Yes
Generates new content
No
Yes
Preserves your exact wording
Mostly — only light edits
No — produces a response, not a transcript
Best for
Dictation, messages, docs
Drafting, summarizing, Q&A
The shortcut: Default = clean up what I said. Assistant = do what I said.
Custom Prompts
Default and Assistant are the two built-in modes, but you can create as many custom prompts as you want. A custom prompt is just a set of instructions the LLM follows when processing your transcript.
Creating a custom prompt
Open the AI Enhancement panel
Click the + button in the prompt grid
Give it a title, description, and icon
Write your prompt instructions
Save — it will appear in your prompt grid alongside Default and Assistant
Good custom prompt ideas
Email Formal — rewrite into a professional business email with a greeting and sign-off
Bullet Points — convert a stream of thoughts into a tight bulleted list
Slack Reply — casual tone, no greetings, short and direct
Code Comment — rewrite as a concise code comment explaining the why
Translate to Spanish — translate the transcript into Spanish
You can switch between prompts at any time by clicking a different card in the prompt grid. The selected prompt becomes your active mode.
Configuring AI Providers
AI Enhancement needs an LLM to do the actual processing. EmberType supports several providers — pick whichever fits your workflow, budget, and privacy preferences.
Cloud providers (API key required)
OpenAI — GPT-4o and other OpenAI models. Fast and high quality. Get an API key from platform.openai.com.
Anthropic — Claude models. Strong at following nuanced prompts. Get an API key from console.anthropic.com.
Groq — extremely fast inference on open models like Llama. Good for low-latency workflows. Get an API key from console.groq.com.
Google Gemini — Gemini models via Google AI Studio. Get an API key from aistudio.google.com.
Custom / OpenAI-compatible — point at any OpenAI-compatible endpoint (OpenRouter, Together, Fireworks, etc.)
Local (fully offline)
Ollama — run models locally on your Mac for 100% offline enhancement. Install from ollama.com, pull a model (e.g. ollama pull llama3.1), and EmberType will detect it automatically.
Adding an API key
Open the AI Enhancement panel in EmberType
Find the API Key Management section
Paste your API key into the field for the provider you want to use
Select a model from the dropdown that appears
Privacy: API keys are stored in your macOS Keychain, never in plaintext. When you use a cloud provider, your transcript is sent to that provider for processing. If you need full privacy, use Ollama for local enhancement — nothing leaves your Mac.
Clipboard and Screen Context
AI Enhancement can be given additional context to improve accuracy and relevance:
Clipboard Context — whatever is on your clipboard is passed along so the AI understands what you're replying to or working with.
Screen Context (Contextual Awareness) — text from your active window is captured via OCR and passed along so the AI understands what's on your screen. Read the full Contextual Awareness guide.
Both toggles appear under the main Enable Enhancement switch and can be turned on or off independently. They work with any mode (Default, Assistant, or custom).
Privacy and Data Handling
EmberType itself is 100% offline — Whisper transcription always runs locally on your Mac. AI Enhancement is the one feature that can send data to a third party, and only if you choose a cloud provider.
AI Enhancement is off by default — you opt in by toggling it on.
Local provider (Ollama) = fully offline — transcripts, clipboard content, and screen context never leave your Mac.
Cloud providers (OpenAI, Anthropic, Groq, Gemini, etc.) — your transcript and any enabled context is sent to that provider for processing, subject to their privacy policy.
API keys are stored in macOS Keychain — never in plaintext, never transmitted to EmberType servers.
EmberType has no servers — there is no account, no telemetry on your transcripts, and nothing to log in to.
Make sure you've entered an API key for at least one provider and selected a model.
Confirm the provider you configured is the one currently selected in the model dropdown.
Check your internet connection if you're using a cloud provider.
If using Ollama, make sure Ollama is running and you've pulled a model (ollama list).
Assistant mode isn't responding, it's just cleaning up my text
You're probably on Default mode. Open the AI Enhancement panel and click the Assistant card to make it the active mode. The selected card is highlighted.
Default mode is trying to answer my questions instead of cleaning them up
You're probably on Assistant mode. Switch to Default by clicking the Default card in the prompt grid.
The enhanced output lost important words or numbers
Try a more capable model (GPT-4o, Claude Sonnet, or a larger Ollama model).
Add your specialized vocabulary to the Dictionary so Whisper transcribes it correctly before enhancement.
Consider enabling Screen Context or Clipboard Context so the AI has more to work with.
Enhancement is too slow
Groq is by far the fastest cloud provider for most workloads.
For local enhancement, use a smaller Ollama model (e.g. llama3.1:8b instead of a 70B model).
Turn off Screen Context if you don't need it — it adds a screen capture and OCR step.
I want different behavior than Default or Assistant offer