Overview
Kling LipSync converts audio into talking-head video by generating lifelike lip movements synced to the input audio. It is a ready-to-use REST inference API with high performance, no cold starts, and affordable pricing.
Authentication
This endpoint requires authentication using a Bearer token. Provide your API key in the format:
YOUR_API_KEY
Request Body
The URL of the audio file used to generate synchronized lip movements. Supported formats: .mp3, .wav, .m4a, .aac. Maximum file size: 5MB.
The URL of the video file in which synchronized lip movements are generated. Supported formats: .mp4, .mov. The file must not exceed 100MB, and the video must be between 2s and 10s long. Only 720p and 1080p are supported, and both width and height must be between 720px and 1920px.
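The limits above can be checked on the client before a task is submitted. The following is a minimal sketch, not part of the API: `VideoInfo`, `check_audio`, and `check_video` are hypothetical names, the caller is assumed to have measured size, duration, and dimensions already, and 1MB is assumed to mean 1024 * 1024 bytes. It checks only the 720px to 1920px range and does not try to interpret "720p" or "1080p".

```python
import os
from dataclasses import dataclass
from urllib.parse import urlparse

AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac"}
VIDEO_EXTENSIONS = {".mp4", ".mov"}
MAX_AUDIO_BYTES = 5 * 1024 * 1024      # 5MB (assumes 1MB = 1024 * 1024 bytes)
MAX_VIDEO_BYTES = 100 * 1024 * 1024    # 100MB, same assumption


@dataclass
class VideoInfo:
    """Hypothetical container for metadata the caller has measured."""
    url: str
    size_bytes: int
    duration_s: float
    width: int
    height: int


def _extension(url: str) -> str:
    # Take the extension from the URL path, ignoring any query string.
    return os.path.splitext(urlparse(url).path)[1].lower()


def check_audio(url: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the audio looks acceptable."""
    problems = []
    if _extension(url) not in AUDIO_EXTENSIONS:
        problems.append("audio format must be .mp3, .wav, .m4a, or .aac")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append("audio file exceeds 5MB")
    return problems


def check_video(info: VideoInfo) -> list[str]:
    """Return a list of problems; an empty list means the video looks acceptable."""
    problems = []
    if _extension(info.url) not in VIDEO_EXTENSIONS:
        problems.append("video format must be .mp4 or .mov")
    if info.size_bytes > MAX_VIDEO_BYTES:
        problems.append("video file exceeds 100MB")
    if not 2 <= info.duration_s <= 10:
        problems.append("video length must be between 2s and 10s")
    if not (720 <= info.width <= 1920 and 720 <= info.height <= 1920):
        problems.append("width and height must both be between 720px and 1920px")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violated limit at once.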
Request Example
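The page gives neither the endpoint URL nor the request field names, so the following is only an illustrative sketch. `ENDPOINT_URL` is a placeholder, the `audio` and `video` field names are assumptions, and a JSON body sent via POST is assumed as well. The `Authorization: Bearer ...` header follows the standard Bearer token convention.

```python
import json
import urllib.request

# Placeholder: the real endpoint URL is not given in this page.
ENDPOINT_URL = "https://api.example.com/kling-lipsync"


def build_lipsync_request(api_key: str, audio_url: str, video_url: str) -> urllib.request.Request:
    """Build (but do not send) a lip-sync request."""
    # "audio" and "video" are assumed field names, not confirmed by the source.
    payload = {"audio": audio_url, "video": video_url}
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumed HTTP method
    )


request = build_lipsync_request(
    "YOUR_API_KEY",
    "https://example.com/speech.mp3",
    "https://example.com/face.mp4",
)
# To send it, pass the object to urllib.request.urlopen(request)
# and decode the JSON response body.
```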
Response
Unique identifier for the prediction (task ID)
Status of the task: created, processing, completed, or failed
Usage Notes
- The video should contain a clear, visible face for optimal lip synchronization
- Audio duration should match or be shorter than the video duration
- Processing time varies based on video length and complexity
- Use the task ID returned in the response to query the generation status via the Query Task endpoint
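Querying the status by task ID, as the last note describes, usually means polling until the task reaches a final state. The sketch below assumes that `completed` and `failed` are the terminal statuses. `wait_for_task` and its `fetch_status` callback are hypothetical: the callback stands in for a call to the Query Task endpoint, whose URL and response shape are not described here.

```python
import time
from typing import Callable

# Assumed terminal statuses, taken from the status list in the Response section.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], str],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status(task_id) until the task completes, fails, or times out.

    fetch_status is a hypothetical callback that queries the task and
    returns one of: created, processing, completed, failed.
    """
    waited = 0.0
    while True:
        status = fetch_status(task_id)
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Because processing time varies with video length and complexity, the interval and timeout are left as parameters. Injecting `sleep` keeps the function easy to test without real delays.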

