FlowLens
Current product and engineering documentation for the screen-aware desktop voice overlay.
Project documentation
A screen-aware desktop voice overlay that helps with visible work without forcing a context switch.
FlowLens is packaged as an Electron desktop app for macOS and Windows. Install it once, complete setup in-app, then use the global hotkey and tray/menu-bar controls instead of rebuilding or rerunning terminal commands.
Watch the demo
From setup to screen-aware answer in one desktop flow.
The video shows the packaged setup surface, provider and ElevenLabs configuration, hotkey invocation, live overlay, and copy-ready response.
Mode 01
Prompt Doctor
Diagnoses weak prompts on screen and rewrites them into copy-ready, better-constrained prompts.
Mode 02
Error Explainer
Reads visible stack traces, logs, and failed commands, then returns likely cause and next debugging step.
Mode 03
Writing Improver
Tightens technical writing such as PR descriptions, issue drafts, comments, and notes.
Current shell
Installable desktop app
NSIS, portable, DMG, and ZIP package targets wrap setup, tray/menu-bar control, launch-at-login, update checks, diagnostics, and cleanup.
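The four package targets correspond to electron-builder's `win.target` and `mac.target` options. The fragment below is a minimal sketch of how such a configuration could look. The `appId` and `productName` values are placeholders, and the project's actual build configuration is not shown in these docs and may differ.

```typescript
// Hypothetical electron-builder configuration fragment.
// appId and productName are placeholder values, not taken from the project.
const builderConfig = {
  appId: "com.example.flowlens",
  productName: "FlowLens",
  win: {
    // NSIS installer plus a portable executable
    target: ["nsis", "portable"],
  },
  mac: {
    // Disk image plus a ZIP archive
    target: ["dmg", "zip"],
  },
};
```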
What changed since the first docs pass
The original docs described a hackathon prototype. The current project is closer to a production-ready desktop assistant:
- setup moved from manual JSON editing into a first-run wizard and settings window
- API keys moved out of renderer snapshots and into encrypted Electron safeStorage
- config now uses Electron userData at runtime, with migration from the legacy ~/.flowlens/config.json
- the overlay can be dragged, remembers custom position, and can reset to bottom right
- the matrix visualizer is driven by live microphone amplitude and TTS state
- tray/menu-bar controls expose settings, setup, listening, updates, launch-at-login, reset, and quit
- packaging uses electron-builder for Windows NSIS/portable builds and macOS DMG/ZIP builds
- provider setup now includes OpenCode Go and Gemini OpenAI-compatible presets alongside custom compatible endpoints
- TTS speaks a short sanitized summary instead of reading entire markdown/code answers aloud
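As an illustration of the last point, the sketch below shows one way a markdown answer could be reduced to a short, speakable summary. The function name `toSpokenSummary`, the 240-character cap, and the specific stripping rules are all assumptions made for this example. FlowLens's real sanitizer is not documented here and may work differently.

```typescript
// Assumed cap on spoken text; the real limit is not documented.
const MAX_SPOKEN_CHARS = 240;

// Hypothetical helper: strips common markdown and code, then truncates.
function toSpokenSummary(markdown: string, maxChars: number = MAX_SPOKEN_CHARS): string {
  const plain = markdown
    .replace(/`{3}[\s\S]*?`{3}/g, " ")                  // drop fenced code blocks
    .replace(/`([^`]*)`/g, "$1")                        // unwrap inline code
    .replace(/!\[[^\]]*\]\([^)]*\)/g, " ")              // drop images
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")            // keep link text only
    .replace(/^[ \t]{0,3}(#{1,6}|[-*+]|>)[ \t]+/gm, "") // strip heading, bullet, and quote markers
    .replace(/\*+|~~/g, "")                             // strip asterisk emphasis and strikethrough
    .replace(/\s+/g, " ")                               // collapse whitespace
    .trim();

  if (plain.length <= maxChars) return plain;

  // Prefer to cut at the last sentence end that fits inside the limit.
  const head = plain.slice(0, maxChars);
  const lastStop = Math.max(
    head.lastIndexOf(". "),
    head.lastIndexOf("? "),
    head.lastIndexOf("! "),
  );
  return lastStop > 0 ? head.slice(0, lastStop + 1) : head.trimEnd();
}
```

Underscores are left alone on purpose so that identifiers such as `spoken_summary` survive intact. The full answer, including code, would still be rendered in the overlay card.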
How it works in one pass
- Press the global hotkey.
- If setup is incomplete, FlowLens opens setup instead of starting capture.
- If setup is valid, the overlay appears, captures the screen, and starts microphone recording.
- The renderer streams audio chunks and live VU levels while speech-turn detection waits through natural pauses.
- The main process sends audio to ElevenLabs scribe_v2.
- The transcript, screenshot, selected mode, and conversation state go to the configured OpenAI-compatible provider.
- The provider returns spoken_summary, card_content, clarifying_question, and actionable_output.
- The overlay renders the answer, optionally plays ElevenLabs TTS, and lets the user copy the final output.
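The four fields returned by the provider can be modeled as a small typed structure. The sketch below validates a raw JSON string into that shape. Only the field names come from the flow above. The field types, which fields are optional, and the `parseProviderAnswer` helper are assumptions made for illustration.

```typescript
// Field names come from the documented flow; types and optionality are assumed.
interface ProviderAnswer {
  spoken_summary: string;
  card_content: string;
  clarifying_question: string | null;
  actionable_output: string | null;
}

// Hypothetical validator for the provider's JSON reply.
function parseProviderAnswer(raw: string): ProviderAnswer {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("Provider answer must be a JSON object");
  }
  const record = data as Record<string, unknown>;

  // Required fields must be non-empty strings.
  const requireString = (key: string): string => {
    const value = record[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error("Missing or empty field: " + key);
    }
    return value;
  };

  // Optional fields collapse to null when absent or blank.
  const optionalString = (key: string): string | null => {
    const value = record[key];
    return typeof value === "string" && value.trim() !== "" ? value : null;
  };

  return {
    spoken_summary: requireString("spoken_summary"),
    card_content: requireString("card_content"),
    clarifying_question: optionalString("clarifying_question"),
    actionable_output: optionalString("actionable_output"),
  };
}
```

Validating at this boundary keeps malformed provider output from reaching the overlay or the TTS step.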
Why it exists
Developer help is usually trapped behind context switching. You copy an error into a browser tab, rewrite a prompt in a side tool, or clean up technical writing in another editor. Each jump costs attention.
FlowLens removes that jump. It sees what is already on screen, listens to the request in natural language, and answers inside the same workspace. The product is built for short, high-value interventions rather than broad autonomous desktop control.
Current scope
- Three focused modes: Prompt Doctor, Error Explainer, and Writing Improver.
- Explicit invocation only. No wake word, no passive background capture, and no autonomous desktop control.
- Primary-screen capture for reliability, with the overlay hidden during capture.
- One clarifying follow-up turn inside a single invocation.
- Official platform support is macOS and Windows.
- Packaging and update infrastructure exists; public non-technical distribution still requires signed/notarized release credentials and published artifacts.
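The single-follow-up limit in the scope list can be expressed as a small piece of per-invocation state. The sketch below is one possible way to enforce it. The names `InvocationState`, `MAX_FOLLOW_UPS`, and `nextStep` are hypothetical and do not come from the FlowLens codebase.

```typescript
// Assumed from the scope list: one clarifying follow-up per invocation.
const MAX_FOLLOW_UPS = 1;

// Hypothetical per-invocation state.
interface InvocationState {
  followUpsUsed: number;
}

interface StepDecision {
  action: "ask" | "finish";
  state: InvocationState;
}

// Decide whether to ask the provider's clarifying question or finish the turn.
function nextStep(state: InvocationState, clarifyingQuestion: string | null): StepDecision {
  if (clarifyingQuestion !== null && state.followUpsUsed < MAX_FOLLOW_UPS) {
    return { action: "ask", state: { followUpsUsed: state.followUpsUsed + 1 } };
  }
  // Either there is no question or the follow-up budget is spent.
  return { action: "finish", state };
}
```

Under this sketch, a second clarifying question in the same invocation is ignored and the overlay finishes with whatever answer it has.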
Skim the right page next
Onboarding
Getting Started
Setup, packaged app expectations, local developer commands, and how to run this docs site.
Product
How FlowLens Works
The runtime flow from hotkey to overlay response, including capture, STT, provider request, TTS, and follow-up.
Engineering
Architecture
Electron process boundaries, provider adapters, secret storage, setup gating, and packaged lifecycle.
Workflow
Built with Kiro
How spec-driven development shaped the product boundary, implementation sequence, and later production-readiness work.