Product
Clarifying Follow-Up
One extra voice turn when FlowLens needs just enough context to finish the answer well.
What it does
If the first multimodal pass cannot answer confidently, FlowLens can ask a single clarifying question and give the user one more voice turn to resolve the ambiguity.
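As an illustration of that decision, here is a minimal sketch of a confidence gate. The source does not say how FlowLens measures confidence, so the `FirstPass` type, its `confidence` score, and the `CONFIDENCE_THRESHOLD` value are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff; the source does not state how confidence is measured.
CONFIDENCE_THRESHOLD = 0.7


@dataclass
class FirstPass:
    """Assumed shape of the first multimodal pass result."""
    summary: str
    confidence: float  # assumed to be a score in [0, 1]
    clarifying_question: Optional[str] = None


def question_to_ask(first_pass: FirstPass) -> Optional[str]:
    """Return the single clarifying question to ask, or None to answer directly."""
    if first_pass.confidence >= CONFIDENCE_THRESHOLD:
        return None
    # Below the cutoff: surface the question if the model proposed one.
    return first_pass.clarifying_question
```

Returning `None` covers both the confident case and the case where no question was proposed, so the caller only has to handle one "no follow-up" path.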
Input
- the initial screenshot
- the first transcript
- the first assistant response
- one spoken follow-up answer from the user
Output
- an updated structured response that replaces the earlier draft
- a better final actionable_output
- TTS playback for the new short summary if voice playback is enabled
- the same overlay experience, now grounded by the additional detail
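The inputs and outputs listed above could be modeled roughly as follows. This is a sketch under stated assumptions, not the FlowLens implementation: only `actionable_output` is named in the source, while `StructuredResponse`, `FollowUpInput`, `finish_with_follow_up`, and the `model` and `speak` callables are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class StructuredResponse:
    summary: str            # short text, also used for TTS playback
    actionable_output: str  # field name taken from the source


@dataclass(frozen=True)
class FollowUpInput:
    screenshot: bytes                   # the initial screenshot
    first_transcript: str               # the first transcript
    first_response: StructuredResponse  # the first assistant response (draft)
    follow_up_transcript: str           # the one spoken follow-up answer


# Hypothetical signatures for the model call and the TTS hook.
Model = Callable[[FollowUpInput], StructuredResponse]
Speak = Callable[[str], None]


def finish_with_follow_up(
    inputs: FollowUpInput,
    model: Model,
    speak: Optional[Speak] = None,
) -> StructuredResponse:
    """Produce the response that replaces the earlier draft."""
    final = model(inputs)
    # Play the new short summary only when voice playback is enabled.
    if speak is not None:
        speak(final.summary)
    return final
```

Passing `speak=None` stands in for "voice playback disabled" here; the caller would render the returned response in the same overlay in place of the draft.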
Why it is useful
This is the smallest possible conversational loop that still improves quality. It gives the model a way to recover when the screenshot is incomplete without turning the product into a full chat session.
What the current implementation intentionally does not do
- open-ended conversations
- more than one follow-up turn
- persistent memory across separate invocations
- threaded session history
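One way to enforce these limits is a per-invocation turn budget. The sketch below is illustrative only; the `Invocation` class and its method names are assumptions, not part of FlowLens.

```python
class Invocation:
    """State for one invocation; nothing here outlives the object."""

    MAX_FOLLOW_UPS = 1  # a single clarifying turn, per the limits above

    def __init__(self) -> None:
        self._follow_ups_used = 0

    def can_ask_follow_up(self) -> bool:
        return self._follow_ups_used < self.MAX_FOLLOW_UPS

    def record_follow_up(self) -> None:
        """Consume the follow-up turn, refusing a second one."""
        if not self.can_ask_follow_up():
            raise RuntimeError("follow-up turn already used for this invocation")
        self._follow_ups_used += 1
```

Because the counter lives on the object and is never written anywhere else, each new invocation starts with a fresh budget and no memory of earlier ones.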