POST /v1/chat/completions
gemini-2.5-flash-image (image edit)
Image edit API for Gemini, exposed in the OpenAI-compatible chat completions format.
curl -X POST "https://gptproto.com/v1/chat/completions" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "messages": [
    {
      "role": "user",
      "content": "Two people standing together",
      "experimental_attachments": [
        {
          "contentType": "image/png",
          "url": "https://mgszhhytxjyitifbalro.supabase.co/storage/v1/object/public/chat-attachments/uploads/guakmauqtr.png"
        },
        {
          "contentType": "image/jpeg",
          "url": "https://mgszhhytxjyitifbalro.supabase.co/storage/v1/object/public/chat-attachments/uploads/3ox9ikxbcxc.jpeg"
        }
      ]
    }
  ],
  "model": "gemini-2.5-flash-image"
}'
Example error response:
{
  "error": {
    "message": "Invalid signature",
    "type": "401"
  }
}

Parameters

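| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| messages | array | ✅ Yes | - | Array of message objects for the conversation, with role set to user. |
| content | string | ✅ Yes | - | The prompt for the generation. |
| experimental_attachments | array | ✅ Yes | - | Image attachments. Supported MIME types: image/png, image/jpeg, image/webp. |
| url | string | ✅ Yes | - | URL of the image. |
| model | string | ✅ Yes | gemini-2.5-flash-image | |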
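
The same request body can be assembled programmatically. The following is a minimal Python sketch using only the standard library. The helper names `build_payload` and `build_request` are hypothetical, the image URLs are placeholders, and the endpoint and header format are copied from the curl example above rather than from any official client. The sketch builds the request but does not send it.

```python
import json
import urllib.request

API_URL = "https://gptproto.com/v1/chat/completions"  # endpoint from the curl example


def build_payload(prompt, attachments, model="gemini-2.5-flash-image"):
    """Build the JSON body: one user message carrying the prompt and the images.

    `attachments` is a list of (content_type, url) pairs.
    """
    return {
        "messages": [
            {
                "role": "user",
                "content": prompt,
                "experimental_attachments": [
                    {"contentType": content_type, "url": url}
                    for content_type, url in attachments
                ],
            }
        ],
        "model": model,
    }


def build_request(api_key, payload):
    """Wrap the payload in a POST request with the headers used by the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


payload = build_payload(
    "Two people standing together",
    [
        ("image/png", "https://example.com/first.png"),     # placeholder URL
        ("image/jpeg", "https://example.com/second.jpeg"),  # placeholder URL
    ],
)
request = build_request("YOUR_API_KEY", payload)
# To send it: urllib.request.urlopen(request) returns the HTTP response.
```

Keeping payload construction separate from sending makes it easy to inspect or log the body before any network call is made.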

Image Edit

The default output size in chat mode is 1:1 (1024x1024).

Response

{
  "candidates": [
      {
          "content": {
              "role": "model",
              "parts": [
                  {
                      "inlineData": {
                          "mimeType": "image/png",
                          "data": "base64"
                      }
                  }
              ]
          },
          "finishReason": "STOP"
      }
  ],
  "usageMetadata": {
      "promptTokenCount": 1302,
      "candidatesTokenCount": 1290,
      "totalTokenCount": 2592,
      "thoughtsTokenCount": 0,
      "promptTokensDetails": [
          {
              "modality": "IMAGE",
              "tokenCount": 1290
          },
          {
              "modality": "TEXT",
              "tokenCount": 12
          }
      ]
  },
  "modelVersion": "gemini-2.5-flash-image"
}
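
To use the result, the image bytes have to be pulled out of the `inlineData` parts. The sketch below is one way to do that in Python, assuming that `data` holds the base64-encoded image (the sample above shows only the placeholder "base64") and that a response may contain parts without `inlineData`. The function name `extract_images` is hypothetical, and the stand-in response carries a few dummy bytes rather than a real image.

```python
import base64


def extract_images(response):
    """Return a list of (mime_type, raw_bytes) for every inline image in the response."""
    if "error" in response:
        # Error bodies look like {"error": {"message": ..., "type": ...}}
        raise RuntimeError(response["error"].get("message", "unknown error"))
    images = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline:  # skip parts that carry no inline image
                images.append((inline["mimeType"], base64.b64decode(inline["data"])))
    return images


# Stand-in response with dummy bytes in place of a real encoded image.
sample = {
    "candidates": [
        {
            "content": {
                "role": "model",
                "parts": [
                    {
                        "inlineData": {
                            "mimeType": "image/png",
                            "data": base64.b64encode(b"\x89PNG").decode("ascii"),
                        }
                    }
                ],
            },
            "finishReason": "STOP",
        }
    ]
}

images = extract_images(sample)
# Each item can then be written to disk, e.g. open(path, "wb").write(raw_bytes).
```

Returning the MIME type alongside the bytes lets the caller choose a file extension without inspecting the image data.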

Adding and removing elements

Provide an image and describe your change. The model will match the original image’s style, lighting, and perspective.
Example (images omitted):
  • Input: A photorealistic picture of a fluffy ginger cat…
  • Output: Using the provided image of my cat, please add a small, knitted wizard hat…

Advanced composition: Combining multiple images

Provide multiple images as context to create a new, composite scene. This is perfect for product mockups or creative collages.
Example (images omitted):
  • Input 1: A professionally shot photo of a blue floral summer dress…
  • Input 2: Full-body shot of a woman with her hair in a bun…
  • Output: Create a professional e-commerce fashion photo…

Best Practices

To elevate your results from good to great, incorporate these professional strategies into your workflow.
  • Be Hyper-Specific: The more detail you provide, the more control you have. Instead of “fantasy armor,” describe it: “ornate elven plate armor, etched with silver leaf patterns, with a high collar and pauldrons shaped like falcon wings.”
  • Provide Context and Intent: Explain the purpose of the image. The model’s understanding of context will influence the final output. For example, “Create a logo for a high-end, minimalist skincare brand” will yield better results than just “Create a logo.”
  • Iterate and Refine: Don’t expect a perfect image on the first try. Use the conversational nature of the model to make small changes. Follow up with prompts like, “That’s great, but can you make the lighting a bit warmer?” or “Keep everything the same, but change the character’s expression to be more serious.”
  • Use Step-by-Step Instructions: For complex scenes with many elements, break your prompt into steps. “First, create a background of a serene, misty forest at dawn. Then, in the foreground, add a moss-covered ancient stone altar. Finally, place a single, glowing sword on top of the altar.”
  • Use “Semantic Negative Prompts”: Instead of saying “no cars,” describe the desired scene positively: “an empty, deserted street with no signs of traffic.”
  • Control the Camera: Use photographic and cinematic language to control the composition, with terms like wide-angle shot, macro shot, and low-angle perspective.
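
Several of these practices (specific descriptions, stated intent, step-by-step instructions, camera terms) can be combined when a prompt is built in code. The helper below is a hypothetical sketch: the name `compose_prompt` and the "Step N:" and "Camera:" labels are illustrative conventions, not a format the API requires.

```python
def compose_prompt(subject, intent=None, steps=(), camera=None):
    """Join the pieces of a prompt into one string, skipping any that are missing."""
    parts = [intent, subject]
    # Number the steps so the model sees an explicit order.
    parts.extend(f"Step {number}: {step}" for number, step in enumerate(steps, start=1))
    if camera:
        parts.append(f"Camera: {camera}")
    return "\n".join(part.strip() for part in parts if part)


prompt = compose_prompt(
    subject="A glowing sword resting on an ancient altar in a forest.",
    steps=[
        "Create a background of a serene, misty forest at dawn.",
        "In the foreground, add a moss-covered ancient stone altar.",
        "Place a single, glowing sword on top of the altar.",
    ],
    camera="low-angle perspective",
)
```

The resulting string would be sent as the `content` field of the user message.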

Limitations

  • For best performance, use the following languages: EN, es-MX, ja-JP, zh-CN, hi-IN.
  • Image generation does not support audio or video inputs.
  • The model won’t always follow the exact number of image outputs that the user explicitly asks for.
  • The model works best with up to 3 images as an input.
  • When generating text for an image, Gemini works best if you first generate the text and then ask for an image with the text.
  • Uploading images of children is not currently supported in EEA, CH, and UK.
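
Two of the constraints on this page, the supported MIME types and the suggested limit of three input images, can be checked on the client before a request is sent. The following Python sketch is an assumption about how such a check might look. The name `check_attachments` is hypothetical, and how the service itself reacts to unsupported types or extra images is not specified here.

```python
SUPPORTED_MIME_TYPES = {"image/png", "image/jpeg", "image/webp"}
RECOMMENDED_MAX_IMAGES = 3  # "works best with up to 3 images"


def check_attachments(attachments):
    """Return (errors, warnings) for a list of experimental_attachments dicts."""
    errors, warnings = [], []
    for index, attachment in enumerate(attachments):
        content_type = attachment.get("contentType")
        if content_type not in SUPPORTED_MIME_TYPES:
            errors.append(f"attachment {index}: unsupported MIME type {content_type!r}")
        if not attachment.get("url"):
            errors.append(f"attachment {index}: missing url")
    # Exceeding the recommended count is only a warning, not an error.
    if len(attachments) > RECOMMENDED_MAX_IMAGES:
        warnings.append(
            f"{len(attachments)} images supplied; the model works best with up to "
            f"{RECOMMENDED_MAX_IMAGES}"
        )
    return errors, warnings


errors, warnings = check_attachments(
    [{"contentType": "image/gif", "url": "https://example.com/a.gif"}]  # placeholder URL
)
```

Treating the image count as a warning rather than an error reflects the wording of the limitation, which describes a recommendation and not a hard cap.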