Image Vision
How agents analyze screenshots and images attached in messaging channels.
Image Vision
CodeSpar agents can analyze screenshots and images attached to messages in supported channels. When you send a screenshot alongside a command, the agent sees the image and uses it as context for its response.
Estimated time: 2 minutes to configure, then attach images to any command
How It Works
The image vision flow has four steps:
- User attaches an image in Slack alongside a message (e.g.,
@codespar fix this contrast issuewith a screenshot) - The adapter extracts the image URL from the Slack event payload (
url_private_downloadfield) - The agent downloads and encodes the image using the bot token for authentication, then base64-encodes the image data
- The image is sent to Claude as an image content block alongside the text instruction
Claude processes both the text and the image together, allowing it to understand visual context such as UI bugs, layout issues, or error messages shown in screenshots.
Supported Formats
| Format | Supported | Max Size |
|---|---|---|
| PNG | Yes | 4 MB |
| JPEG | Yes | 4 MB |
| GIF | Yes | 4 MB |
| WEBP | Yes | 4 MB |
Images larger than 4 MB are skipped with a warning message. Only raster image formats are supported -- SVG and PDF files are not processed as images.
Use Cases
UI Bug Reports
Attach a screenshot showing a visual bug and describe the issue:
The agent sees the screenshot, identifies the relevant CSS files using the smart file picker (which also receives the image), and generates a fix with appropriate color values.
Error Screenshots
Share a screenshot of an error message or stack trace:
The agent reads the error from the image and correlates it with the codebase to propose a fix.
Design Feedback
Provide annotated mockups or design screenshots:
The agent analyzes the design screenshot and generates code changes to match the visual specification.
Smart File Picker Integration
When an image is attached, it is also included in the smart file picker prompt. Claude Haiku sees the image when deciding which files are relevant to the task. This means the file picker can make better selections based on visual context -- for example, selecting CSS files when the image shows a styling issue, or selecting component files when the image shows a specific UI element.
Slack Setup
To enable image vision in Slack, the bot needs the files:read OAuth scope. This allows the bot to download images attached to messages.
Add the Scope
- Go to your Slack App configuration
- Select your CodeSpar app
- Navigate to OAuth & Permissions
- Under Bot Token Scopes, add
files:read - Reinstall the app to your workspace to apply the new scope
Verify
After adding the scope, attach an image to a message with a CodeSpar command. The agent should acknowledge the image in its response. If the image is not detected, check that:
- The
files:readscope is listed in your bot's scopes - The app has been reinstalled after adding the scope
- The image is under 4 MB
- The image format is PNG, JPEG, GIF, or WEBP
Channel Support
| Channel | Status | Notes |
|---|---|---|
| Slack | Supported | Requires files:read bot scope, downloads via bot token auth |
| Discord | Supported | Public attachment URLs from message.attachments, no auth needed |
| Telegram | Supported | Resolves file URLs via getFile API, downloads from Telegram CDN |
| Supported | Extracts imageMessage.url / mediaUrl from Evolution API payload | |
| CLI | Not applicable | CLI does not support image attachments |
Channel Implementation Details
Each channel handles image extraction differently based on its platform API.
Slack
Slack image URLs use the url_private_download field from the event payload. These URLs require authentication. The adapter downloads the image using a Bearer token (the bot's SLACK_BOT_TOKEN) in the Authorization header. The files:read OAuth scope must be granted to the bot for this to work.
Discord
Discord attachments are included in the message.attachments collection on the message event. Each attachment has a public url property that can be downloaded directly without any authentication. This makes Discord the simplest channel for image vision since no additional scopes or tokens are required beyond the standard bot setup.
Telegram
Telegram does not include direct file URLs in message events. Instead, the message payload contains a file_id. The adapter calls the Bot API's getFile method with this ID:
This returns a file_path, which is then used to construct the download URL:
The file is downloaded from Telegram's CDN and base64-encoded for Claude.
WhatsApp image messages arrive via the Evolution API webhook payload. The image URL is available in the imageMessage.url or mediaUrl field, depending on the Evolution API version. The adapter downloads the image directly from this URL. No additional authentication is needed since the Evolution API provides pre-authenticated media URLs.
Next Steps
- Dev Agent -- smart file picker and image vision details
- Creating PRs with the Dev Agent -- end-to-end PR workflow
- Multi-Channel Setup -- configure multiple channels