Documentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-mintlify-ea9e18cc.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Vision lets users upload images for an agent to analyze. The agent passes the image to a vision-capable model, which describes, summarizes, or answers questions about what’s in it.
Enable vision capabilities
Vision only works with models that support image inputs. If the selected model can’t handle image inputs, the upload control on the message composer is disabled.
Switch to a vision-capable model in model parameters to re-enable it.
Use vision capabilities
Click the paperclip icon at the bottom-left of the message composer and choose Upload to Provider to attach an image — a screenshot, a photo, a chart, a diagram. Then ask any question that requires reading the image: “What’s wrong with this query plan?”, “Transcribe the text in this screenshot,” or “Compare this dashboard to last week’s.”
The agent treats the image as part of the message context, so follow-up questions in the same turn can reference what it saw without re-uploading.
Vision pairs well with the code interpreter for image-driven analysis — for example, the agent reads numbers off a screenshot and then runs Python to compute totals — and with web search when an image references something the model needs to look up.