Image + Text-to-Image (Editing): Provide an image and use text prompts to add, remove, or modify elements, change the style, or adjust the color grading.
Multi-Image to Image (Composition & Style Transfer): Use multiple input images to compose a new scene or transfer the style from one image to another.
Iterative Refinement: Engage in a conversation to progressively refine your image over multiple turns, making small adjustments until it’s perfect.
High-Fidelity Text Rendering: Accurately generate images that contain legible and well-placed text, ideal for logos, diagrams, and posters.
Provide an image and describe your change. The model will match the original image’s style, lighting, and perspective.
curl --location 'https://gptproto/v1/chat/completions' \
  --header 'Authorization: sk-xxxx' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Using the provided image of my cat, please add a small, knitted wizard hat on its head. Make it look like it'\''s sitting comfortably and not falling off.",
        "experimental_attachments": [
          { "contentType": "image/png", "url": "image_url" }
        ],
        "model": "gemini-2.5-flash-image-preview"
      }
    ]
  }'
Input image: A photorealistic picture of a fluffy ginger cat…
Output image: result of the prompt “Using the provided image of my cat, please add a small, knitted wizard hat…”
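The same request can be assembled programmatically. The following is a minimal Python sketch, not an official client: the endpoint, header format, and field layout (including `model` sitting inside the message object) are copied from the curl example above, while the helper names `build_edit_payload` and `build_request` are hypothetical. Building the body with `json.dumps` also avoids having to escape apostrophes in the prompt for the shell.

```python
import json
import urllib.request

# Assumptions: URL, headers, and JSON layout mirror the curl example;
# the API key and image URL are placeholders to replace with real values.
API_URL = "https://gptproto/v1/chat/completions"
MODEL = "gemini-2.5-flash-image-preview"


def build_edit_payload(prompt, image_urls, content_type="image/png"):
    """Build the JSON body for an image-editing request (hypothetical helper)."""
    attachments = [{"contentType": content_type, "url": url} for url in image_urls]
    message = {
        "role": "user",
        "content": prompt,
        "experimental_attachments": attachments,
        "model": MODEL,  # kept inside the message, as in the curl example
    }
    return {"messages": [message]}


def build_request(payload, api_key):
    """Wrap the payload in a POST request object without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


payload = build_edit_payload(
    "Using the provided image of my cat, please add a small, knitted wizard hat "
    "on its head. Make it look like it's sitting comfortably and not falling off.",
    ["image_url"],
)
request = build_request(payload, "sk-xxxx")
# Passing `request` to urllib.request.urlopen would send it; omitted here.
print(json.dumps(payload, indent=2))
```

How the generated image is returned in the response is not described in this section, so the sketch stops at constructing the request.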
Conversationally define a “mask” to edit a specific part of an image while leaving the rest untouched.
curl --location 'https://gptproto/v1/chat/completions' \
  --header 'Authorization: sk-xxxx' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Using the provided image of a living room, change only the blue sofa to be a vintage, brown leather chesterfield sofa. Keep the rest of the room, including the pillows on the sofa and the lighting, unchanged.",
        "experimental_attachments": [
          { "contentType": "image/png", "url": "image_url" }
        ],
        "model": "gemini-2.5-flash-image-preview"
      }
    ]
  }'
Input image: A wide shot of a modern, well-lit living room…
Output image: result of the prompt “Using the provided image of a living room, change only the blue sofa to be a vintage, brown leather chesterfield sofa…”
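The prompt above follows a reusable pattern: name the single element to change, describe its replacement, and list what must stay fixed. As an illustration only, a small Python helper (the name `localized_edit_prompt` and its parameters are hypothetical, not part of any API) could assemble such prompts:

```python
def localized_edit_prompt(scene, target, replacement, keep):
    """Compose a prompt that changes one element and names what must stay fixed."""
    change = (
        f"Using the provided image of {scene}, change only {target} "
        f"to be {replacement}."
    )
    if keep:
        # Explicitly listing protected elements acts as the conversational "mask".
        keep_sentence = (
            f"Keep the rest of the image, including {' and '.join(keep)}, unchanged."
        )
    else:
        keep_sentence = "Keep the rest of the image unchanged."
    return f"{change} {keep_sentence}"


prompt = localized_edit_prompt(
    scene="a living room",
    target="the blue sofa",
    replacement="a vintage, brown leather chesterfield sofa",
    keep=["the pillows on the sofa", "the lighting"],
)
print(prompt)
```

The resulting string goes in the `content` field of the request, with the source image attached as before.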
Provide an image and ask the model to recreate its content in a different artistic style.
curl --location 'https://gptproto/v1/chat/completions' \
  --header 'Authorization: sk-xxxx' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Transform the provided photograph of a modern city street at night into the artistic style of Vincent van Gogh'\''s '\''Starry Night'\''. Preserve the original composition of buildings and cars, but render all elements with swirling, impasto brushstrokes and a dramatic palette of deep blues and bright yellows.",
        "experimental_attachments": [
          { "contentType": "image/png", "url": "image_url" }
        ],
        "model": "gemini-2.5-flash-image-preview"
      }
    ]
  }'
Input image: A photorealistic, high-resolution photograph of a busy city street…
Output image: result of the prompt “Transform the provided photograph of a modern city street at night…”
Provide multiple images as context to create a new, composite scene. This is perfect for product mockups or creative collages.
curl --location 'https://gptproto/v1/chat/completions' \
  --header 'Authorization: sk-xxxx' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Create a professional e-commerce fashion photo. Take the blue floral dress from the first image and let the woman from the second image wear it. Generate a realistic, full-body shot of the woman wearing the dress, with the lighting and shadows adjusted to match the outdoor environment.",
        "experimental_attachments": [
          { "contentType": "image/png", "url": "image_url_1" },
          { "contentType": "image/png", "url": "image_url_2" }
        ],
        "model": "gemini-2.5-flash-image-preview"
      }
    ]
  }'
Input 1 image: A professionally shot photo of a blue floral summer dress…
Input 2 image
Output image
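Because the prompt refers to "the first image" and "the second image", the order of the attachments matters. The sketch below shows one way to build a multi-image body in Python while preserving that order. It assumes the same field layout as the curl example above; `build_composition_payload` is a hypothetical helper name, and the two-image minimum is a check added for this illustration, not a documented API rule.

```python
import json

MODEL = "gemini-2.5-flash-image-preview"


def build_composition_payload(prompt, image_urls, content_type="image/png"):
    """Build a request body with several input images, keeping their order."""
    if len(image_urls) < 2:
        # Illustrative guard: a composition prompt needs multiple inputs.
        raise ValueError("composition needs at least two input images")
    # List order is preserved, so "first image" maps to image_urls[0], and so on.
    attachments = [{"contentType": content_type, "url": url} for url in image_urls]
    message = {
        "role": "user",
        "content": prompt,
        "experimental_attachments": attachments,
        "model": MODEL,  # kept inside the message, as in the curl example
    }
    return {"messages": [message]}


payload = build_composition_payload(
    "Create a professional e-commerce fashion photo. Take the blue floral dress "
    "from the first image and let the woman from the second image wear it.",
    ["image_url_1", "image_url_2"],  # placeholders: dress first, then the woman
)
print(json.dumps(payload, indent=2))
```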
To ensure critical details (like a face or logo) are preserved during an edit, describe them in great detail along with your edit request.
curl --location 'https://gptproto/v1/chat/completions' \
  --header 'Authorization: sk-xxxx' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Take the first image of the woman with brown hair, blue eyes, and a neutral expression. Add the logo from the second image onto her black t-shirt. Ensure the woman'\''s face and features remain completely unchanged. The logo should look like it'\''s naturally printed on the fabric, following the folds of the shirt.",
        "experimental_attachments": [
          { "contentType": "image/png", "url": "image_url_1" },
          { "contentType": "image/png", "url": "image_url_2" }
        ],
        "model": "gemini-2.5-flash-image-preview"
      }
    ]
  }'
Input 1 image: A professional headshot of a woman with brown hair and blue eyes…
Input 2 image: A simple, modern logo with the letters ‘G’ and ‘A’…
Output image: result of the prompt “Take the first image of the woman with brown hair, blue eyes, and a neutral expression…”
To elevate your results from good to great, incorporate these professional strategies into your workflow.
Be Hyper-Specific: The more detail you provide, the more control you have. Instead of “fantasy armor,” describe it: “ornate elven plate armor, etched with silver leaf patterns, with a high collar and pauldrons shaped like falcon wings.”
Provide Context and Intent: Explain the purpose of the image. The model’s understanding of context will influence the final output. For example, “Create a logo for a high-end, minimalist skincare brand” will yield better results than just “Create a logo.”
Iterate and Refine: Don’t expect a perfect image on the first try. Use the conversational nature of the model to make small changes. Follow up with prompts like, “That’s great, but can you make the lighting a bit warmer?” or “Keep everything the same, but change the character’s expression to be more serious.”
Use Step-by-Step Instructions: For complex scenes with many elements, break your prompt into steps. “First, create a background of a serene, misty forest at dawn. Then, in the foreground, add a moss-covered ancient stone altar. Finally, place a single, glowing sword on top of the altar.”
Use “Semantic Negative Prompts”: Instead of saying “no cars,” describe the desired scene positively: “an empty, deserted street with no signs of traffic.”
Control the Camera: Use photographic and cinematic language to control the composition, with terms such as wide-angle shot, macro shot, and low-angle perspective.
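The step-by-step strategy above lends itself to a small helper that turns an ordered list of scene elements into a single prompt with "First", "Then", and "Finally" cues. This is only an illustrative Python sketch. The function name `step_by_step_prompt` is hypothetical, and the cue words simply follow the example given in the list.

```python
def step_by_step_prompt(steps):
    """Join scene-building steps with ordinal cues: First, Then, ..., Finally."""
    if not steps:
        raise ValueError("at least one step is required")
    parts = []
    last = len(steps) - 1
    for index, step in enumerate(steps):
        if index == 0:
            cue = "First"
        elif index == last:
            cue = "Finally"
        else:
            cue = "Then"
        # Strip any trailing period so each sentence ends with exactly one.
        parts.append(f"{cue}, {step.rstrip('.')}.")
    return " ".join(parts)


prompt = step_by_step_prompt([
    "create a background of a serene, misty forest at dawn",
    "in the foreground, add a moss-covered ancient stone altar",
    "place a single, glowing sword on top of the altar",
])
print(prompt)
```

Keeping the steps in a list makes it easy to reorder, add, or drop elements between iterations without rewriting the whole prompt.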