Text to Speech Lip Sync - GPTProto API Documentation

Overview

Kling TextToVideo by Kwaivgi converts input text to speech and generates a video whose lip movements are synchronized to that speech. It is available as a ready-to-use REST inference API with no cold starts.

Authentication

This endpoint requires authentication using a Bearer token.

Authorization

string

default:"sk-***********"

required

Your API key in the format: YOUR_API_KEY

Request Body

video

string

default:"https://example.com/video.mp4"

required

The URL of the video file for generating synchronized lip movements. Requirements:

Supported formats: .mp4, .mov
File size: Max 100MB
Duration: 2s - 10s
Resolution: 720p or 1080p only
Dimensions: Both width and height must be between 720px and 1920px
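
The requirements above can be checked on the client before a job is submitted. The following is a minimal sketch, assuming the caller has already obtained the file's size, duration, and dimensions by other means (for example, a media probing tool). The function name `check_video` and its parameters are hypothetical and not part of the API. The "720p or 1080p" rule is not checked because the page does not define it for portrait videos.

```python
import os
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".mp4", ".mov"}
# Assumption: "100MB" is read as 100 * 1024 * 1024 bytes; the docs do not say.
MAX_SIZE_BYTES = 100 * 1024 * 1024


def check_video(url, size_bytes, duration_s, width, height):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Take the extension from the URL path only, ignoring any query string.
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file larger than 100MB")
    if not 2 <= duration_s <= 10:
        problems.append("duration must be between 2s and 10s")
    if not (720 <= width <= 1920 and 720 <= height <= 1920):
        problems.append("width and height must each be between 720px and 1920px")
    return problems
```

For example, `check_video("https://example.com/video.mp4", 5_000_000, 5.0, 1280, 720)` returns an empty list.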

text

string

required

Text content for lip-sync video generation. This text will be converted to speech and synchronized with lip movements in the video. Limit: maximum 120 characters.

voice_id

string

default:"genshin_klee2"

required

Voice ID to use for speech synthesis. Different voice IDs provide different character voices and speaking styles. Available Voice IDs:

genshin_klee2 - Genshin Klee character voice
genshin_vindi2 - Genshin Vindi character voice
genshin_kirara - Genshin Kirara character voice
zhinen_xuesheng - Student voice
AOT - Attack on Titan style
ai_shatang - Sweet voice
ai_kaiya - Kaiya voice
oversea_male1 - Overseas male voice
ai_chenjiahao_712 - Chen Jiahao voice
girlfriend_4_speech02 - Girlfriend voice 4
chat1_female_new-3 - Female chat voice
chat_0407_5-1 - Chat voice variant
cartoon-boy-07 - Cartoon boy voice
uk_boy1 - UK boy voice
cartoon-girl-01 - Cartoon girl voice
PeppaPig_platform - Peppa Pig style voice
ai_huangzhong_712 - Huang Zhong voice
ai_huangyaoshi_712 - Huang Yaoshi voice
ai_laoguowang_712 - Lao Guo Wang voice
chengshu_jiejie - Mature sister voice
you_pingjing - Calm voice
calm_story1 - Calm storytelling voice
uk_man2 - UK man voice
laopopo_speech02 - Grandmother voice
heainainai_speech02 - Grandma voice

voice_language

string

default:"en"

The voice language corresponding to the Voice ID. Supported Languages:

zh - Chinese
en - English

voice_speed

number

default:"1"

Speech rate of the generated speech. Range: 0.8 - 2.0

Default: 1.0 (normal speed)
Values > 1.0 increase speed
Values < 1.0 decrease speed
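
The documented limits on the request body can be enforced before sending. The sketch below is one possible way to do that: `build_payload` is a hypothetical helper, not part of the API, and it assumes the 120-character limit counts Unicode code points, which the page does not specify. It does not check `voice_id` against the list above or verify that the voice matches the language.

```python
SUPPORTED_LANGUAGES = {"zh", "en"}


def build_payload(video, text, voice_id="genshin_klee2",
                  voice_language="en", voice_speed=1.0):
    """Validate the documented limits and return the JSON body as a dict."""
    # Assumption: len() (code points) approximates the service's character count.
    if not text or len(text) > 120:
        raise ValueError("text must be 1 to 120 characters long")
    if voice_language not in SUPPORTED_LANGUAGES:
        raise ValueError("voice_language must be 'zh' or 'en'")
    if not 0.8 <= voice_speed <= 2.0:
        raise ValueError("voice_speed must be between 0.8 and 2.0")
    return {
        "video": video,
        "text": text,
        "voice_id": voice_id,
        "voice_language": voice_language,
        "voice_speed": voice_speed,
    }
```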

Request Example

curl --location 'https://gptproto.com/api/v3/kwaivgi/kling-lipsync/text-to-video' \
--header 'Authorization: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
  "text": "Kling lipsync on GptprotoAI is an AI-powered model. Try it now!",
  "video": "https://d1q70pf5vjeyhc.cloudfront.net/predictions/b82002c0695a48ccb6e08d23602402ed/1.mp4",
  "voice_id": "genshin_klee2",
  "voice_language": "en",
  "voice_speed": 1.3
}'
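
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl command above, including passing the API key as the raw `Authorization` value. The Authentication section describes a Bearer token, so if the service expects that scheme, pass `"Bearer " + key` instead. The helper names `build_request` and `submit` are illustrative.

```python
import json
import urllib.request

ENDPOINT = "https://gptproto.com/api/v3/kwaivgi/kling-lipsync/text-to-video"


def build_request(api_key, payload):
    """Build the POST request without sending it."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,  # same form as the curl example
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(api_key, payload, timeout=30):
    """Send the request and return the decoded JSON body.

    Non-2xx responses raise urllib.error.HTTPError.
    """
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```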

Response

data.id

string

Unique identifier for the prediction (the task ID)

data.status

string

Status of the task: created, processing, completed, or failed

Error response example:

{
    "error": {
        "message": "Invalid signature",
        "type": "401"
    }
}
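
A client needs to tell the two response shapes apart. The sketch below assumes that a successful body is a JSON object with a top-level `data` object holding `id` and `status`, as the field names `data.id` and `data.status` suggest, and that a failure carries an `error` object like the one above. `ApiError` and `parse_response` are hypothetical names.

```python
class ApiError(Exception):
    """Raised when the response body contains an "error" object."""


def parse_response(body):
    """Return (task_id, status) from a decoded JSON body, or raise ApiError."""
    if "error" in body:
        err = body["error"]
        raise ApiError(f"{err.get('type')}: {err.get('message')}")
    data = body["data"]
    return data["id"], data["status"]
```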

Usage Notes

Face Requirements: The video should contain a clear, visible face for optimal lip synchronization. The model creates unique movement trajectories based on facial features
Natural Lip Movements: The AI generates naturally matched lip movements that synchronize precisely with the generated audio
Video Integrity: Areas outside the face remain consistent with the original video, ensuring visual integrity and continuity
Voice Models: Different voice_id values provide different character voices and speaking styles (e.g., “genshin_klee2” for anime-style voices)
Language Support: Use the voice_language parameter to specify the language for speech generation (“en” for English or “zh” for Chinese)
Speed Control: The voice_speed parameter allows you to control speech rate. Default is 1.0 (normal speed), values > 1.0 increase speed, values < 1.0 decrease speed
Processing Time: Processing duration varies based on text length, video length, and complexity
Query Results: Use the task ID returned in the response to query the generation status via the Query Task endpoint
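
Because processing is asynchronous, a client typically polls until the task reaches a final status. The Query Task endpoint is not described on this page, so the sketch below takes a caller-supplied `fetch_status` function (hypothetical) that returns one of the documented status strings for a task ID. The polling interval and timeout are arbitrary illustrative defaults.

```python
import time

FINAL_STATUSES = ("completed", "failed")


def wait_for_task(task_id, fetch_status, interval_s=2.0, timeout_s=300.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(task_id) until it returns a final status.

    Raises TimeoutError if no final status is seen within timeout_s.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(task_id)
        if status in FINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.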
