kling-lipsync (audio to video) - GPTProto API Documentation

The GPTProto-format endpoint for Kling’s audio-to-video lip-sync API, which generates lip movements in a video that are synchronized to an audio file.

curl --location 'https://gptproto.com/api/v3/kwaivgi/kling-lipsync/audio-to-video' \
--header 'Authorization: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
  "audio": "https://d1q70pf5vjeyhc.cloudfront.net/predictions/1a91a2f60dff4594b2ad9d5396aef8de/1.mp3",
  "video": "https://d2p7pge43lyniu.cloudfront.net/output/2e375cce-8989-4d72-9629-be85e616b295-u1_07850bf7-c365-46c6-a636-9e5fa6822173.mp4"
}'
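
For readers calling the endpoint from code rather than the shell, the curl request above might be mirrored in Python roughly as follows. This is a minimal sketch using only the standard library; the helper name `build_lipsync_request` and the example.com URLs are illustrative placeholders, not part of the API. The request is built but not sent, so the snippet runs without a valid key.

```python
import json
import urllib.request

ENDPOINT = "https://gptproto.com/api/v3/kwaivgi/kling-lipsync/audio-to-video"


def build_lipsync_request(api_key: str, audio_url: str, video_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"audio": audio_url, "video": video_url}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            # The curl example passes the key as-is, with no "Bearer " prefix.
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
    )


request = build_lipsync_request(
    "YOUR_API_KEY",
    "https://example.com/speech.mp3",  # placeholder audio URL
    "https://example.com/face.mp4",    # placeholder video URL
)
# To actually send it: urllib.request.urlopen(request, timeout=60)
```

Keeping construction separate from sending makes it easy to inspect the headers and body before spending a real API call.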

{
    "error": {
        "message": "Invalid signature",
        "type": "401"
    }
}
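
The response above wraps the failure in an `error` object with `message` and `type` fields. A client could detect that envelope along these lines. This is a sketch only: the helper name `extract_error` is hypothetical, and the assumption that successful responses carry no top-level `error` key is not confirmed by this page.

```python
import json
from typing import Optional, Tuple


def extract_error(response_text: str) -> Optional[Tuple[str, str]]:
    """Return (type, message) if the body holds an error envelope, else None."""
    payload = json.loads(response_text)
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None  # assumption: no "error" object means the call succeeded
    return str(error.get("type", "")), str(error.get("message", ""))


sample = '{"error": {"message": "Invalid signature", "type": "401"}}'
print(extract_error(sample))  # ('401', 'Invalid signature')
```

Note that `type` arrives as the string "401" in the example, so the sketch keeps it as a string rather than converting it to an integer.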

Parameters

Parameter	Type	Required	Default	Description
`audio`	string	✅ Yes	-	URL of the audio file used to generate the synchronized lip movements. Supported formats: .mp3, .wav, .m4a, .aac. Maximum file size: 5 MB.
`video`	string	✅ Yes	-	URL of the video file whose lip movements will be synchronized. Supported formats: .mp4, .mov. Maximum file size: 100 MB. Duration must be between 2 s and 10 s. Only 720p and 1080p are supported, and both width and height must be between 720 px and 1920 px.
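
The limits in the table can be checked on the client before submitting, which avoids a round trip for inputs that would be rejected anyway. The sketch below is illustrative: `check_inputs` is a hypothetical helper, the caller is assumed to already know each file's size, duration, and dimensions, and "MB" is assumed to mean 2^20 bytes because the page does not say. It checks the pixel bounds only and does not separately verify the 720p/1080p requirement.

```python
import posixpath
from urllib.parse import urlparse

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".aac"}
VIDEO_EXTS = {".mp4", ".mov"}
MB = 1024 * 1024  # assumption: the docs do not say whether MB is 10^6 or 2^20 bytes


def url_extension(url: str) -> str:
    """Lower-cased file extension of the URL path, ignoring any query string."""
    return posixpath.splitext(urlparse(url).path)[1].lower()


def check_inputs(audio_url, audio_bytes, video_url, video_bytes,
                 duration_s, width, height):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if url_extension(audio_url) not in AUDIO_EXTS:
        problems.append("audio must be .mp3, .wav, .m4a, or .aac")
    if audio_bytes > 5 * MB:
        problems.append("audio exceeds 5 MB")
    if url_extension(video_url) not in VIDEO_EXTS:
        problems.append("video must be .mp4 or .mov")
    if video_bytes > 100 * MB:
        problems.append("video exceeds 100 MB")
    if not 2 <= duration_s <= 10:
        problems.append("video duration must be between 2 s and 10 s")
    if not (720 <= width <= 1920 and 720 <= height <= 1920):
        problems.append("video width and height must be between 720 px and 1920 px")
    return problems


# A 6-second 1280x720 clip with a small MP3 passes every check.
print(check_inputs("https://example.com/a.mp3", 1 * MB,
                   "https://example.com/v.mp4", 20 * MB, 6.0, 1280, 720))  # []
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with an upload at once.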

Usage Notes

The video should contain a clear, visible face for optimal lip synchronization
Audio duration should match or be shorter than the video duration
Processing time varies based on video length and complexity
Use the task ID returned in the response to query the generation status via the Query Task endpoint
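
Because processing time varies, a client will usually poll with the task ID until the job finishes. The Query Task endpoint and its response shape are not described on this page, so the sketch below takes the lookup as a caller-supplied function. `wait_for_task`, `fetch_status`, the `status` field, and the terminal state names are all hypothetical and should be replaced with whatever the Query Task documentation specifies.

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], Dict],   # hypothetical: wraps the Query Task call
    terminal_states: Iterable[str],        # hypothetical: e.g. the "done"/"failed" names
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_status(task_id) until its "status" field is terminal."""
    terminal = set(terminal_states)
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_status(task_id)
        # Assumption: the status lives under a top-level "status" key.
        if result.get("status") in terminal:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s} s")
        sleep(interval_s)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of the real endpoint and makes it testable without network access or real delays.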
