POST /v1/images/generations
Text to Image
curl --request POST \
  --url https://gptproto.com/v1/images/generations \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "prompt": "<string>",
  "size": "<string>"
}'

Overview

  • Text-to-Image: Generate high-quality images from simple or complex text descriptions.

Supported inputs & outputs:

Inputs: Text
Outputs: Text and image

Text-to-Image

curl -X POST "https://gptproto.com/v1/images/generations" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "gemini-2.5-flash-image",
  "prompt": "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme",
  "size": "16:9"
}'
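
The same request can also be built from Python. The following is a minimal sketch using only the standard library, not an official client: the helper name `build_generation_request` is hypothetical, while the endpoint, the Bearer header, and the body fields are taken from the curl example above. It only constructs the request; sending it requires a valid API key and network access.

```python
import json
import urllib.request

API_URL = "https://gptproto.com/v1/images/generations"  # endpoint from this page


def build_generation_request(api_key: str, prompt: str, size: str = "16:9",
                             model: str = "gemini-2.5-flash-image") -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = json.dumps({"model": model, "prompt": prompt, "size": size}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_generation_request("YOUR_API_KEY", "A nano banana dish in a fancy restaurant")
    # To send it (assumes a valid key):
    # with urllib.request.urlopen(req) as resp:
    #     result = json.load(resp)
    print(req.get_method(), req.full_url)
```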

Parameters

| Parameter | Type | Required | Default | Range | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | Yes | gemini-2.5-flash-image | - | - |
| prompt | string | Yes | - | - | The positive prompt for the generation. You can also include an image URL. |
| size | string | Yes | - | 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 | The aspect ratio of the generated media. |

Size

By default, the model generates 1:1 square images. The available aspect ratios and the resolution generated for each are listed in this table:
| Aspect ratio | Resolution |
| --- | --- |
| 1:1 | 1024x1024 |
| 2:3 | 832x1248 |
| 3:2 | 1248x832 |
| 3:4 | 864x1184 |
| 4:3 | 1184x864 |
| 4:5 | 896x1152 |
| 5:4 | 1152x896 |
| 9:16 | 768x1344 |
| 16:9 | 1344x768 |
| 21:9 | 1536x672 |
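
For client-side validation, the table above can be mirrored as a lookup. This is a sketch only: the names `RESOLUTIONS`, `resolution_for`, and `closest_ratio` are hypothetical, and the values are copied from the table rather than returned by the API. `closest_ratio` is a convenience that assumes you want the supported ratio nearest to a desired width and height.

```python
# Aspect ratio -> (width, height), copied from the table above.
RESOLUTIONS = {
    "1:1": (1024, 1024),
    "2:3": (832, 1248),
    "3:2": (1248, 832),
    "3:4": (864, 1184),
    "4:3": (1184, 864),
    "4:5": (896, 1152),
    "5:4": (1152, 896),
    "9:16": (768, 1344),
    "16:9": (1344, 768),
    "21:9": (1536, 672),
}


def resolution_for(size: str) -> tuple:
    """Return (width, height) for a supported aspect ratio, or raise ValueError."""
    if size not in RESOLUTIONS:
        supported = ", ".join(RESOLUTIONS)
        raise ValueError(f"Unsupported size {size!r}; expected one of: {supported}")
    return RESOLUTIONS[size]


def closest_ratio(width: int, height: int) -> str:
    """Pick the supported ratio whose width/height is nearest the requested shape."""
    target = width / height
    return min(
        RESOLUTIONS,
        key=lambda r: abs(RESOLUTIONS[r][0] / RESOLUTIONS[r][1] - target),
    )
```

Checking the `size` value before sending the request avoids spending a round trip on a ratio the table does not list.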

Authentication

This endpoint requires authentication using a Bearer token.
Authorization
string
default:"sk-***********"
required
Your API key in the format: Bearer YOUR_API_KEY

Request Body

model
string
default:"gemini-2.5-flash-image"
required
The model to use for the request
prompt
string
required
The text prompt describing the image to generate
size
string
default:"16:9"
The aspect ratio of the generated image

Response

Success
200
{
    "created": 1762156444807,
    "data": [
        {
            "b64_json": "image_base64"
        }
    ],
    "output_format": "png",
    "quality": "high",
    "size": "16:9",
    "usage": {
        "input_tokens": 535,
        "input_tokens_details": {
            "image_tokens": 516,
            "text_tokens": 19
        },
        "output_tokens": 1291,
        "total_tokens": 1826
    }
}
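
The image in a success response arrives as Base64 text in `data[].b64_json`. The sketch below decodes it into raw bytes that can be written to a file. The helper names are hypothetical, and it assumes the field holds plain standard Base64 with no data-URI prefix, which this page does not state explicitly.

```python
import base64
import json


def extract_images(response_text: str) -> list:
    """Decode every b64_json entry of a success response into raw image bytes."""
    payload = json.loads(response_text)
    return [base64.b64decode(item["b64_json"]) for item in payload.get("data", [])]


def save_image(raw: bytes, path: str) -> None:
    """Write decoded image bytes to disk in binary mode."""
    with open(path, "wb") as fh:
        fh.write(raw)


if __name__ == "__main__":
    # Stand-in response: real responses carry PNG data, per "output_format".
    sample = json.dumps({"data": [{"b64_json": base64.b64encode(b"fake-bytes").decode()}]})
    images = extract_images(sample)
    print(len(images), "image(s) decoded")
```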
Error
401
{
  "error": {
    "message": "Invalid signature",
    "type": "401"
  }
}
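
A client can separate success bodies from error bodies by checking for the top-level `error` object shown above. This is a minimal sketch under that assumption; `ApiError` and `raise_for_error` are hypothetical names, and only the `message` and `type` fields documented on this page are read.

```python
import json


class ApiError(Exception):
    """Hypothetical exception wrapping the error body shown above."""

    def __init__(self, message: str, error_type: str):
        super().__init__(f"{error_type}: {message}")
        self.message = message
        self.error_type = error_type


def raise_for_error(response_text: str) -> dict:
    """Return the parsed body, or raise ApiError if it carries an "error" object."""
    payload = json.loads(response_text)
    error = payload.get("error")
    if error is not None:
        raise ApiError(str(error.get("message", "")), str(error.get("type", "")))
    return payload
```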

Request Example

Photorealistic scenes

For realistic images, use photography terms. Mention camera angles, lens types, lighting, and fine details to guide the model toward a photorealistic result.
curl --location 'https://gptproto.com/v1/images/generations' \
--header 'Authorization: sk-xxxx' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gemini-2.5-flash-image",
  "prompt": "A photorealistic close-up portrait of an elderly Japanese ceramicist with deep, sun-etched wrinkles and a warm, knowing smile. He is carefully inspecting a freshly glazed tea bowl. The setting is his rustic, sun-drenched workshop with pottery wheels and shelves of clay pots in the background. The scene is illuminated by soft, golden hour light streaming through a window, highlighting the fine texture of the clay and the fabric of his apron. Captured with an 85mm portrait lens, resulting in a soft, blurred background (bokeh). The overall mood is serene and masterful.",
  "size": "1:1"
}'
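
If you generate many photorealistic prompts, it may help to assemble them from the elements named above (subject, setting, lighting, lens, mood). The following is only an illustrative sketch: the function name and the sentence template are assumptions modeled on the example prompt, not a format the API requires.

```python
def photorealistic_prompt(subject: str, setting: str, lighting: str,
                          lens: str, mood: str) -> str:
    """Join photography-style details into a single descriptive prompt."""
    parts = [
        f"A photorealistic shot of {subject}.",
        f"The setting is {setting}.",
        f"The scene is illuminated by {lighting}.",
        f"Captured with {lens}.",
        f"The overall mood is {mood}.",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print(photorealistic_prompt(
        subject="an elderly ceramicist inspecting a freshly glazed tea bowl",
        setting="a rustic, sun-drenched workshop",
        lighting="soft, golden hour light streaming through a window",
        lens="an 85mm portrait lens",
        mood="serene and masterful",
    ))
```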

Stylized illustrations & stickers

To create stickers, icons, or assets, be explicit about the style and state the background you want; the example below requests a white one.
curl --location 'https://gptproto.com/v1/images/generations' \
--header 'Authorization: sk-xxxx' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gemini-2.5-flash-image",
  "prompt": "A kawaii-style sticker of a happy red panda wearing a tiny bamboo hat. It'"'"'s munching on a green bamboo leaf. The design features bold, clean outlines, simple cel-shading, and a vibrant color palette. The background must be white.",
  "size": "1:1"
}'
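
The `'"'"'` sequence in the curl command above is shell quoting for a literal apostrophe inside a single-quoted string; it is not part of the prompt. When the body is built in code instead, a JSON serializer handles quoting. A minimal Python sketch follows; the helper name `request_body` is hypothetical, and the field names come from the parameter table.

```python
import json


def request_body(prompt: str, size: str = "1:1",
                 model: str = "gemini-2.5-flash-image") -> str:
    """Serialize the request body; json.dumps escapes double quotes and backslashes."""
    return json.dumps({"model": model, "prompt": prompt, "size": size})


# Apostrophes need no escaping in JSON; double quotes are escaped automatically.
body = request_body("It's munching on a green bamboo leaf.")
quoted = request_body('A caption box reads "The city was a tough place to keep secrets."')
print(body)
print(quoted)
```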

Accurate text in images

Gemini excels at rendering text. Be clear about the text, the font style (descriptively), and the overall design.
curl --location 'https://gptproto.com/v1/images/generations' \
--header 'Authorization: sk-xxxx' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gemini-2.5-flash-image",
  "prompt": "Create a modern, minimalist logo for a coffee shop called '"'"'The Daily Grind'"'"'. The text should be in a clean, bold, sans-serif font. The design should feature a simple, stylized icon of a coffee bean seamlessly integrated with the text. The color scheme is black and white.",
  "size": "1:1"
}'

Product mockups & commercial photography

For clean, professional product shots, describe the lighting setup, the surface, the camera angle, and the point of focus.
curl --location 'https://gptproto.com/v1/images/generations' \
--header 'Authorization: sk-xxxx' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gemini-2.5-flash-image",
  "prompt": "A high-resolution, studio-lit product photograph of a minimalist ceramic coffee mug in matte black, presented on a polished concrete surface. The lighting is a three-point softbox setup designed to create soft, diffused highlights and eliminate harsh shadows. The camera angle is a slightly elevated 45-degree shot to showcase its clean lines. Ultra-realistic, with sharp focus on the steam rising from the coffee. Square image.",
  "size": "1:1"
}'

Minimalist & negative space design

Excellent for creating backgrounds for websites, presentations, or marketing materials where text will be overlaid.
curl --location 'https://gptproto.com/v1/images/generations' \
--header 'Authorization: sk-xxxx' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gemini-2.5-flash-image",
  "prompt": "A minimalist composition featuring a single, delicate red maple leaf positioned in the bottom-right of the frame. The background is a vast, empty off-white canvas, creating significant negative space for text. Soft, diffused lighting from the top left. Square image.",
  "size": "1:1"
}'

Sequential art (Comic panel / Storyboard)

Builds on character consistency and scene description to create panels for visual storytelling.
curl --location 'https://gptproto.com/v1/images/generations' \
--header 'Authorization: sk-xxxx' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gemini-2.5-flash-image",
  "prompt": "A single comic book panel in a gritty, noir art style with high-contrast black and white inks. In the foreground, a detective in a trench coat stands under a flickering streetlamp, rain soaking his shoulders. In the background, the neon sign of a desolate bar reflects in a puddle. A caption box at the top reads \"The city was a tough place to keep secrets.\" The lighting is harsh, creating a dramatic, somber mood. Landscape.",
  "size": "16:9"
}'