Extract text and analyze images using Gemini 2.5 Pro model to generate detailed descriptions and insights
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
model | string | ✅ Yes | - | - | Model ID used to generate the response, formatted as models/{model}. |
method | string | ✅ Yes | - | generateContentstreamGenerateContent | Method to use for content generation. |
| Method | Example | Description |
|---|---|---|
generateContent | https://gptproto.com/v1beta/models/gemini-2.5-pro:generateContent | Generates a complete response all at once. Best for applications where you need the full response before processing. |
streamGenerateContent | https://gptproto.com/v1beta/models/gemini-2.5-pro:streamGenerateContent | Streams the response incrementally as it’s generated. Ideal for chat interfaces and real-time applications where latency is important. |
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
contents | array | ✅ Yes | - | - | Content of the current conversation with the model. For single-turn queries, this contains one instance. For multi-turn queries (e.g., chat), this contains the conversation history and the latest request. |
>contents.role | string | ✅ Yes | - | usermodel | The role of the message sender. |
>contents.parts | array | ✅ Yes | - | - | The content parts of the message, which can contain different types of content (text, inlineData, etc.). |
>>contents.parts.text | string | ✅ Yes | - | - | Text content of the part. For multimodal input details, see Multimodal Input. |
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
tools | array | ❌ No | - | - | List of tools the model may use to generate the next response. Supported tools include Function and codeExecution. |
toolConfig | object | ❌ No | - | - | Configuration for any tools specified in the request. |
safety_settings | array | ❌ No | - | - | List of unique SafetySetting instances for filtering unsafe content. Each SafetyCategory should have at most one setting. see SafetySetting |
generation_config | object | ❌ No | - | - | Configuration options for content generation. |
>generation_config.temperature | number | ❌ No | - | 0.0-1.0 | Controls the randomness of the output. Lower values produce more deterministic results. |
>generation_config.top_p | number | ❌ No | - | 0.0-1.0 | Nucleus sampling probability threshold. |
>generation_config.top_k | integer | ❌ No | - | - | Top-k sampling parameter. |
>generation_config.max_output_tokens | integer | ❌ No | - | - | Maximum number of tokens to generate. |
>generation_config.thinking_config | object | ❌ No | - | - | Configuration for thinking functionality. If set for models that don’t support thinking, the system will return an error. See Thinking Config for details. |
>generation_config.image_config | object | ❌ No | - | - | Configuration for image generation. If set for models that don’t support these configuration options, the system will return an error. See Image Config for details. |
>generation_config.mediaResolution | enum | ❌ No | - | MEDIA_RESOLUTION_UNSPECIFIEDMEDIA_RESOLUTION_LOWMEDIA_RESOLUTION_MEDIUMMEDIA_RESOLUTION_HIGH | If specified, uses the specified media resolution. Note: This field describes the resolution of input media. To control the resolution of output images, use the imageConfig field instead. |
| parameter | Type | Required | Default | Range / Example | Description | |
|---|---|---|---|---|---|---|
contents.parts | array | ✅ Yes | — | textinlineDatafileData | The content parts of the message, which can contain different types of content . | |
>contents.parts.inlineData | object | ❌ No | - | - | Inline media content. If used, data must be base64-encoded. | |
>>contents.parts.inlineData.mimeType | string | ✅ Yes (if inline_data is used) | - | application/pdfimage/jpeg | The IANA-standard MIME type of the source data. If the provided MIME type is not supported, the system will return an error. | |
>>contents.parts.inlineData.data | string | ✅ Yes (if inline_data is used) | - | - | Base64-encoded media data. | |
>contents.parts.fileData | object | ❌ No | - | - | File media content. If used, fileUri must be provided. | |
>>contents.parts.fileData.mimeType | string | ✅ Yes (if file_data is used) | - | application/pdfimage/jpeg | The IANA-standard MIME type of the source data. If the provided MIME type is not supported, the system will return an error. | |
>>contents.parts.fileData.fileUri | string | ✅ Yes (if file_data is used) | - | - | The URI of the file to be processed. |
| parameter | Type | Required | Default | Range / Example | Description |
|---|---|---|---|---|---|
category | string | ✅ Yes | - | HARM_CATEGORY_HATE_SPEECHHARM_CATEGORY_SEXUALLY_EXPLICITHARM_CATEGORY_DANGEROUS_CONTENTHARM_CATEGORY_HARASSMENTHARM_CATEGORY_CIVIC_INTEGRITY | The harm category to apply the safety setting to. |
threshold | string | ✅ Yes | - | BLOCK_ONLY_HIGHBLOCK_MEDIUM_AND_ABOVEBLOCK_LOW_AND_ABOVEBLOCK_NONE | The threshold for blocking content. |
| parameter | Type | Required | Default | Range / Example | Description |
|---|---|---|---|---|---|
temperature | number | ❌ No | - | 0.0-1.0 | Controls the randomness of the output. |
top_p | number | ❌ No | - | 0.0-1.0 | Nucleus sampling threshold. |
top_k | integer | ❌ No | - | - | Top-k sampling parameter. |
max_output_tokens | integer | ❌ No | - | - | Maximum number of tokens to generate. |
stop_sequences | array | ❌ No | - | - | Sequences at which to stop generation. |
response_mime_type | string | ❌ No | - | text/plainapplication/json | MIME type of the response. |
Note:thinking_levelis only supported on Gemini 3.0 and above. It cannot be used together withthinking_budget; doing so will return an error.
| parameter | Type | Required | Default | Range / Example | Description |
|---|---|---|---|---|---|
thinking_config | object | ❌ No | - | - | Configuration for thinking functionality. |
>thinking_config.include_thoughts | boolean | ❌ No | - | - | Indicates whether to include thoughts in the response. If true, thoughts are only returned when thinking is enabled. |
>thinking_config.thinking_budget | integer | ❌ No | - | - | Specifies the maximum number of tokens for generated thoughts. |
>thinking_config.thinking_level | enum | ❌ No | THINKING_LEVEL_UNSPECIFIED | THINKING_LEVEL_UNSPECIFIEDHIGHLOW | Controls the maximum depth of the model’s internal reasoning process before generating a response. If not specified, the default is HIGH. Recommended for Gemini 3 or newer models. Using it with older models may cause errors. |
| parameter | Type | Required | Default | Range / Example | Description |
|---|---|---|---|---|---|
image_config | object | ❌ No | - | - | Configuration for image generation. |
>image_config.aspect_ratio | string | ❌ No | - | 1:12:33:23:44:39:1616:921:9 | Aspect ratio of the generated image. If not specified, the model will select the appropriate aspect ratio based on the specified content. |
>image_config.image_size | string | ❌ No | - | 1k2k4k | Approximate size of the generated image. If not specified, the model will use the default value of 1k. |
| Error Code | Error Name | Description |
|---|---|---|
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Your API key doesn’t have permission to access this resource, or insufficient balance for the requested operation |
| 429 | Too Many Requests | You’ve exceeded your rate limit |
| 500 | Internal server error | An internal server error occurred |
| 503 | Content policy violation | Content blocked due to safety concerns (actual status code is 400) |