Analyze and understand image content using vision-enabled models
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
model | string | ✅ Yes | gpt-41-2025-04-14 | The model to use for the request. Must be a vision-enabled model. |
messages | array | ✅ Yes | - | Array of message objects with role and content |
stream | boolean | ❌ No | false | Whether to stream the response |
messages array should have the following structure:
| Field | Type | Required | Description |
|---|---|---|---|
role | string | ✅ Yes | Role of the message sender. Can be: user or assistant |
content | array | ✅ Yes | Array of content objects (can include text and images) |
content array should have the following structure:
| Field | Type | Required | Description |
|---|---|---|---|
type | string | ✅ Yes | Type of content. Can be: text or image_url |
text | string | ✅ Yes (if type is text) | Text prompt content |
image_url | object | ✅ Yes (if type is image_url) | Image URL object with url field containing image URL or base64 encoded image |