Voice Clone - GPTProto API Documentation

Voice Clone

curl --request POST \
  --url https://gptproto.com/api/v3/minimax/voice-clone \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: application/json' \
  --data '{
  "audio": "<string>",
  "custom_voice_id": "<string>",
  "model": "<string>",
  "need_noise_reduction": true,
  "need_volume_normalization": true,
  "accuracy": 123,
  "text": "<string>"
}'

{
  "error": {
    "message": "Invalid signature",
    "type": "401"
  }
}

Overview

MiniMax Voice Clone is a state-of-the-art voice synthesis model developed by MiniMax. It enables high-quality voice cloning from a short reference clip, producing speech that closely mimics the tone, accent, and personality of the original speaker.

Authentication

This endpoint requires authentication using a Bearer token.

Authorization

string

default:"sk-***********"

required

Your API key (for example, sk-***********), sent as the value of the Authorization header.

Request Body

audio

string

The audio file containing the voice to clone. Supported formats include MP3, M4A, and WAV.

custom_voice_id

string

required

Custom user-defined voice ID. It must be at least 8 characters long, include both letters and numbers, and start with a letter (e.g., gptproto0001). Reusing an existing voice ID returns an error.
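The ID rules above can be checked on the client before a request is sent. The following is a minimal sketch: the function name `is_valid_custom_voice_id` is hypothetical, and it assumes that "letters" and "numbers" mean ASCII letters and digits. It checks only the three stated rules and does not restrict other characters, since the description does not say which are allowed.

```python
import re

MIN_VOICE_ID_LENGTH = 8  # "at least 8 characters" per the parameter description


def is_valid_custom_voice_id(voice_id: str) -> bool:
    """Check only the rules stated in the custom_voice_id description."""
    # Rule 1: minimum length.
    if len(voice_id) < MIN_VOICE_ID_LENGTH:
        return False
    # Rule 2: must start with a letter (assumed ASCII), which also
    # guarantees that the ID contains at least one letter.
    if re.match(r"[A-Za-z]", voice_id) is None:
        return False
    # Rule 3: must contain at least one digit (assumed ASCII).
    return re.search(r"[0-9]", voice_id) is not None
```

Passing this check does not guarantee acceptance by the server: uniqueness of the ID can only be verified by the API itself.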

model

string

default:"speech-2.5-hd-preview-voice-clone"

required

The TTS model used to generate the preview audio. It applies only to the preview produced after cloning; once the voice has been cloned, any MiniMax Turbo or HD voice model can use it for inference.

need_noise_reduction

boolean

Specify whether to enable noise reduction. If not provided, the default value is false.

need_volume_normalization

boolean

default:"false"

Specify whether to enable volume normalization. If not provided, the default value is false.

accuracy

number

default:"0.7"

Text validation accuracy threshold, in the range [0, 1]. If not provided, the default value is 0.7.

text

string

Text for audio preview. Limited to 2000 characters.
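The parameters above can be assembled into a request body with the documented defaults and limits applied. This is a sketch under stated assumptions: `build_voice_clone_payload` is a hypothetical helper, the two flags are sent as JSON booleans as in the request example, the 2000-character limit is assumed to count Python string characters, and optional fields that are not supplied are simply omitted.

```python
from typing import Any, Dict, Optional

DEFAULT_MODEL = "speech-2.5-hd-preview-voice-clone"  # documented default
MAX_PREVIEW_TEXT_LENGTH = 2000  # documented limit for `text`


def build_voice_clone_payload(
    custom_voice_id: str,
    audio: Optional[str] = None,
    model: str = DEFAULT_MODEL,
    need_noise_reduction: bool = False,
    need_volume_normalization: bool = False,
    accuracy: float = 0.7,
    text: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body, enforcing the documented range and length limits."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be within [0, 1]")
    if text is not None and len(text) > MAX_PREVIEW_TEXT_LENGTH:
        raise ValueError("text is limited to 2000 characters")

    payload: Dict[str, Any] = {
        "custom_voice_id": custom_voice_id,
        "model": model,
        "need_noise_reduction": need_noise_reduction,
        "need_volume_normalization": need_volume_normalization,
        "accuracy": accuracy,
    }
    # `audio` and `text` are not marked required, so they are left out
    # of the body when the caller does not provide them.
    if audio is not None:
        payload["audio"] = audio
    if text is not None:
        payload["text"] = text
    return payload
```

Validating on the client keeps obviously invalid requests from reaching the API, but the server remains the authority on what it accepts.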

Request Example

curl -X POST "https://gptproto.com/api/v3/minimax/voice-clone" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "accuracy": 0.7,
  "audio": "https://d1q70pf5vjeyhc.cloudfront.net/media/92d2d4ca66f84793adcb20742b15d262/audios/1752727142784562094_VRSOK53Y.mp3",
  "custom_voice_id": "Alice-jiuhao-2",
  "model": "speech-2.5-hd-preview-voice-clone",
  "need_noise_reduction": false,
  "need_volume_normalization": false,
  "text": "This is a preview of your cloned voice. I hope you enjoy it!"
}'
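For callers who prefer Python to curl, the same request can be sketched with the standard library. `make_voice_clone_request` is a hypothetical helper; it mirrors the curl example above by sending the API key as the raw value of the Authorization header. The function only builds the request object and does not send it.

```python
import json
import urllib.request

VOICE_CLONE_URL = "https://gptproto.com/api/v3/minimax/voice-clone"


def make_voice_clone_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        VOICE_CLONE_URL,
        data=body,
        method="POST",
        headers={
            # Same header values as the curl example.
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. That call raises `urllib.error.HTTPError` for 4xx and 5xx statuses, so error bodies have to be read from the exception.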

Response

{
  "error": {
    "message": "Invalid signature",
    "type": "401"
  }
}
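The response shown above is an error envelope with a `message` and a `type` nested under `error`. A client can detect that shape before treating a response as successful. The sketch below uses a hypothetical helper, `extract_api_error`, and assumes only the envelope shown here; the shape of a successful response is not documented on this page, so the function makes no attempt to interpret it.

```python
import json
from typing import Optional, Tuple


def extract_api_error(response_text: str) -> Optional[Tuple[str, str]]:
    """Return (type, message) if the body is the error envelope, else None."""
    try:
        body = json.loads(response_text)
    except json.JSONDecodeError:
        return None  # not JSON, so not the documented envelope
    if not isinstance(body, dict):
        return None
    error = body.get("error")
    if not isinstance(error, dict):
        return None
    # Both fields are strings in the documented example; str() guards
    # against other JSON types without changing string values.
    return str(error.get("type", "")), str(error.get("message", ""))
```

For the body shown above, the function returns the pair `("401", "Invalid signature")`.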
