FlowLens

Architecture

Electron boundaries, setup gating, encrypted secrets, provider adapters, voice services, and packaged lifecycle.

System shape

Desktop shell

Main process

Owns windows, hotkeys, tray/menu-bar lifecycle, config, encrypted secrets, permissions, updates, diagnostics, screen capture, STT, TTS, and provider orchestration.

UI boundary

Renderer + preload

Renders overlay/setup/settings, records microphone audio, streams chunks and VU levels, plays TTS audio, and talks to main only through typed IPC.

External services

Voice + reasoning

ElevenLabs handles STT and TTS. An OpenAI-compatible provider handles screenshot-plus-transcript reasoning.

The main process owns privileged work and secrets. The renderer owns interaction, audio capture, visualization, and presentation.

Electron app structure

FlowLens is a desktop-first Electron app because the product depends on OS-level behavior:

  • global shortcuts
  • transparent always-on-top overlay windows
  • screen capture
  • tray/menu-bar background control
  • launch-at-login
  • packaged installers
  • secure local secret storage through Electron safeStorage

Main process responsibilities

  • enforce the single-instance lock
  • create the overlay window and the setup/settings window
  • register the global hotkey
  • gate overlay invocation until onboarding is complete
  • create and update the tray/menu-bar menu
  • persist config through Electron userData
  • migrate legacy config from ~/.flowlens/config.json
  • encrypt and decrypt API keys through the secret store
  • check platform and permission status
  • hide the overlay before screen capture
  • assemble the invocation pipeline
  • resolve the active OpenAI-compatible provider
  • call ElevenLabs STT and TTS
  • handle update checks, launch-at-login, diagnostics, and cleanup reset

Renderer and preload responsibilities

  • render the overlay, setup wizard, and settings window
  • expose narrow IPC methods through preload
  • start and stop microphone capture
  • stream audio chunks to main before sending flowlens:audio-stop
  • compute live VU levels for the Matrix component
  • wait through natural pauses using speech-turn detection
  • play TTS audio chunks streamed from main
  • render structured response cards and copyable output
  • request settings, voice lists, permissions, diagnostics, cleanup, and update checks through IPC
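The live VU level mentioned in the list above can be sketched as a root-mean-square calculation over one frame of analyser data. This is only an illustration: the function name is hypothetical, and it assumes the samples are time-domain floats in the range -1 to 1, which is what the Web Audio AnalyserNode returns from getFloatTimeDomainData. The source does not say which metering formula FlowLens uses.

```typescript
// Hypothetical helper, not taken from the FlowLens source.
// Assumes `samples` holds time-domain floats in [-1, 1].
function computeVuLevel(samples: Float32Array): number {
  if (samples.length === 0) {
    return 0;
  }
  // Sum of squared amplitudes across the frame.
  const sumSquares = samples.reduce((acc, s) => acc + s * s, 0);
  // Root mean square, capped at 1 so the visualizer gets a 0..1 value.
  const rms = Math.sqrt(sumSquares / samples.length);
  return Math.min(1, rms);
}
```

A renderer could call this once per animation frame and pass the result to the visualizer as a normalized level.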

Runtime flow

01

Hotkey gate

The global shortcut first checks onboarding status. If setup is incomplete, the setup window opens. If complete, the invocation pipeline starts.
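A minimal sketch of this gate, with the side effects injected so it runs without Electron. The names HotkeyDeps, resolveHotkeyAction, and handleHotkey are hypothetical and only illustrate the branch described above.

```typescript
type HotkeyAction = "open-setup" | "start-invocation";

// Pure decision: which path the hotkey takes for a given onboarding status.
function resolveHotkeyAction(onboardingComplete: boolean): HotkeyAction {
  return onboardingComplete ? "start-invocation" : "open-setup";
}

// Hypothetical dependencies; a real app would back these with main-process code.
interface HotkeyDeps {
  isOnboardingComplete(): boolean;
  openSetupWindow(): void;
  startInvocation(): void;
}

// Runs the chosen side effect and returns the action for logging or tests.
function handleHotkey(deps: HotkeyDeps): HotkeyAction {
  const action = resolveHotkeyAction(deps.isOnboardingComplete());
  if (action === "open-setup") {
    deps.openSetupWindow();
  } else {
    deps.startInvocation();
  }
  return action;
}
```

In an Electron main process, a handler like this would typically be the callback passed to globalShortcut.register.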

02

Screen capture

The main process hides the overlay, waits briefly for the window manager to apply the change, captures the primary screen through desktopCapturer, and restores the overlay.
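The ordering matters here: the overlay should come back even if the capture fails. The sketch below shows one way to guarantee that with try/finally. The CaptureDeps interface, the function name, and the 150 ms default delay are assumptions for illustration; the source only says the wait is brief.

```typescript
// Hypothetical dependencies so the sequence can run outside Electron.
interface CaptureDeps<T> {
  hideOverlay(): void;
  showOverlay(): void;
  wait(ms: number): Promise<void>;
  capturePrimaryScreen(): Promise<T>;
}

// Hide, settle, capture, and always restore the overlay afterwards.
async function captureWithOverlayHidden<T>(
  deps: CaptureDeps<T>,
  settleMs = 150, // assumed value, not from the source
): Promise<T> {
  deps.hideOverlay();
  try {
    // Give the window manager time to remove the overlay from the screen.
    await deps.wait(settleMs);
    return await deps.capturePrimaryScreen();
  } finally {
    // Runs on success and on failure, so the overlay never stays hidden.
    deps.showOverlay();
  }
}
```

In a real main process, capturePrimaryScreen would likely wrap desktopCapturer.getSources with a screen source type, and hideOverlay and showOverlay would call into the overlay BrowserWindow.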

03

Audio capture

The renderer creates MicCapture, records with MediaRecorder, streams chunks to main, and exposes analyser data for the matrix visualizer.

04

Voice and provider

Audio goes to ElevenLabs scribe_v2. The transcript, screenshot, active mode, and conversation state go to the active provider.

05

Structured answer

The provider response is parsed into spoken_summary, card_content, clarifying_question, and actionable_output.

06

Response playback

The overlay renders the answer. If voice playback is enabled, ElevenLabs TTS streams audio chunks back to the renderer.

Provider adapter layer

The current adapter is OpenAI-compatible, with small provider compatibility branches instead of separate full adapters. It reads:

  • providerKey for the active secret
  • providerBaseUrl for the API root
  • providerProtocol for the wire protocol
  • model for the request body

For standard providers, the adapter posts to /chat/completions, sends the screenshot as an image content part, requests JSON output, and validates the structured response before the overlay sees it.
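As a rough illustration of that request, the sketch below builds a chat-completions body with a text part and an image part, following the publicly documented OpenAI message format. The function names, the PNG data URL encoding, the json_object response format, and the supportsJsonResponseFormat flag are assumptions; the actual FlowLens adapter may include a system prompt, mode, and conversation state that are omitted here.

```typescript
// Hypothetical input shape for one invocation.
interface ChatRequestInput {
  model: string;
  transcript: string;
  screenshotPngBase64: string; // assumes a base64-encoded PNG
  supportsJsonResponseFormat: boolean; // false for providers that reject the parameter
}

// Joins the configured base URL with the chat-completions path.
function chatCompletionsUrl(providerBaseUrl: string): string {
  return providerBaseUrl.replace(/\/+$/, "") + "/chat/completions";
}

// Builds the JSON body: one user message with text and image content parts.
function buildChatCompletionsBody(input: ChatRequestInput): Record<string, unknown> {
  const body: Record<string, unknown> = {
    model: input.model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: input.transcript },
          {
            type: "image_url",
            image_url: { url: `data:image/png;base64,${input.screenshotPngBase64}` },
          },
        ],
      },
    ],
  };
  // Only request native JSON output when the provider is known to accept it.
  if (input.supportsJsonResponseFormat) {
    body["response_format"] = { type: "json_object" };
  }
  return body;
}
```

Keeping the response-format decision behind a single flag is one way to keep provider differences out of the rest of the pipeline.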

Provider-specific behavior is isolated here:

Provider family | Handling
OpenAI-compatible | Normal chat-completions payload with image content and native JSON response format
Gemini compatible endpoint | Uses Google's OpenAI-compatible base URL with the same screenshot-plus-text payload shape
OpenCode Go | Infers openai-chat, anthropic-messages, or alibaba-chat behavior per model; omits unsupported JSON response-format parameters; disables the short FlowLens timeout for long-running calls; falls back from prompt-only markdown into a structured response when needed

The goal is to keep the rest of the app provider-agnostic. The overlay, setup flow, response card, TTS, and copyable output continue to work against the same response contract.

Structured response contract

{
  "spoken_summary": "Short answer for TTS and compact UI.",
  "card_content": "Markdown body for the overlay.",
  "clarifying_question": null,
  "actionable_output": "Copy-ready final text."
}

This contract keeps the UI stable. The model can reason freely, but it must return a predictable shape.
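One way to enforce that shape is to validate the provider output before anything reaches the overlay. The sketch below is an assumption about how such a check could look, not the FlowLens validator. It also assumes that only clarifying_question may be null, which the example above suggests but does not confirm.

```typescript
interface StructuredResponse {
  spoken_summary: string;
  card_content: string;
  clarifying_question: string | null;
  actionable_output: string;
}

// Parses raw provider text and throws if it does not match the contract.
function parseStructuredResponse(raw: string): StructuredResponse {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Provider response is not valid JSON");
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("Provider response must be a JSON object");
  }
  const obj = data as Record<string, unknown>;

  // Reads a required string field or fails with a descriptive error.
  const requireString = (key: string): string => {
    const value = obj[key];
    if (typeof value !== "string") {
      throw new Error(`Field ${key} must be a string`);
    }
    return value;
  };

  // A missing clarifying_question is treated the same as null.
  const question = obj["clarifying_question"] ?? null;
  if (question !== null && typeof question !== "string") {
    throw new Error("Field clarifying_question must be a string or null");
  }

  return {
    spoken_summary: requireString("spoken_summary"),
    card_content: requireString("card_content"),
    clarifying_question: question,
    actionable_output: requireString("actionable_output"),
  };
}
```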

Settings and onboarding architecture

Setup and settings are normal BrowserWindows loaded with a role query:

  • role=setup loads the first-run wizard
  • role=settings loads the full settings surface

The wizard saves draft settings, runs connection checks, and blocks completion until provider, ElevenLabs, microphone, and screen checks pass. The tray and hotkey both rely on the same onboarding status.
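The completion gate can be pictured as a small pure function over the four check results. The type and function names below are hypothetical; the check names follow the list in the paragraph above.

```typescript
type CheckName = "provider" | "elevenlabs" | "microphone" | "screen";
type CheckResults = Record<CheckName, boolean>;

// Returns the checks that still block completion, in a stable order.
function failingChecks(results: CheckResults): CheckName[] {
  const order: CheckName[] = ["provider", "elevenlabs", "microphone", "screen"];
  return order.filter((name) => !results[name]);
}

// Onboarding may finish only when every check has passed.
function canCompleteOnboarding(results: CheckResults): boolean {
  return failingChecks(results).length === 0;
}
```

Because the tray and the hotkey both depend on the same status, a single function like this gives them one shared answer instead of two separate checks.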

Overlay layout and positioning

Overlay size is state-driven:

State | Typical size
recording | compact recording layout
processing | compact analysis layout
response compact | scrollable response layout
response expanded | larger reading layout
settings | bounded 460 x 560 layout

The overlay defaults to bottom right. If the user drags it, the main process persists a custom top-left position with display ID and clamps it into the nearest work area on resize.
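The default placement and the clamping step are plain rectangle arithmetic. The sketch below assumes a work area given as x, y, width, and height, which is the shape Electron reports for a display's workArea. The type names, function names, and the 16-pixel margin are illustrative assumptions.

```typescript
interface OverlayPoint { x: number; y: number; }
interface OverlaySize { width: number; height: number; }
interface WorkArea { x: number; y: number; width: number; height: number; }

// Keeps the overlay's top-left corner inside the work area so the whole
// window stays visible. If the overlay is larger than the work area,
// it is pinned to the work area's top-left corner.
function clampToWorkArea(pos: OverlayPoint, size: OverlaySize, area: WorkArea): OverlayPoint {
  const maxX = area.x + Math.max(0, area.width - size.width);
  const maxY = area.y + Math.max(0, area.height - size.height);
  return {
    x: Math.min(Math.max(pos.x, area.x), maxX),
    y: Math.min(Math.max(pos.y, area.y), maxY),
  };
}

// Default position: bottom-right corner, inset by an assumed margin.
function defaultBottomRight(size: OverlaySize, area: WorkArea, margin = 16): OverlayPoint {
  const raw = {
    x: area.x + area.width - size.width - margin,
    y: area.y + area.height - size.height - margin,
  };
  return clampToWorkArea(raw, size, area);
}
```

For a 460 x 560 overlay on a 1920 x 1040 work area at the origin, a saved position of (1800, 900) would be pulled back to (1460, 480).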

Security and privacy boundaries

Boundary | Decision
Raw API keys | Stored encrypted in main-process secret store
Renderer settings | Receives masked key status only
Screenshot capture | Explicit invocation only
Microphone capture | Active request only
Diagnostics | Redacts API keys, auth tokens, screenshots, audio, transcripts, and response-like content
Factory reset | Clears FlowLens-owned settings, secrets, logs, and overlay position
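The diagnostics rule can be illustrated with a recursive redactor that masks values by field name. This is a sketch under stated assumptions: the key pattern and the "[redacted]" placeholder are invented for the example, and a real implementation would probably also need to scan string values, not just field names.

```typescript
// Assumed pattern covering the categories listed in the table above.
const SENSITIVE_KEY_PATTERN =
  /key|token|secret|authorization|screenshot|audio|transcript|response/i;

// Returns a deep copy with sensitive fields replaced by a placeholder.
function redactDiagnostics(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactDiagnostics(item));
  }
  if (typeof value === "object" && value !== null) {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = SENSITIVE_KEY_PATTERN.test(key)
        ? "[redacted]"
        : redactDiagnostics(source[key]);
    }
    return out;
  }
  // Primitives under non-sensitive keys pass through unchanged.
  return value;
}
```

Redacting by name errs on the side of hiding too much, which suits a diagnostics export that may be shared outside the machine.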

Packaged lifecycle

FlowLens uses electron-builder with:

  • Windows NSIS and portable targets
  • macOS DMG and ZIP targets
  • app ID com.flowlens.desktop
  • GitHub Releases as the update provider
  • extra tray icon resource packaging
  • electron-updater status tracking

With the packaged app, normal use never involves running FlowLens from source. A user installs once, finishes onboarding in a normal setup window, optionally enables launch-at-login, and then interacts through the global hotkey and tray/menu-bar menu. The app stays alive in the background after windows close unless the user chooses Quit.
