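feat: integrate Vidu TTS / voice cloning / lip sync, replacing MiniMax voice features
- Add ViduProvider: synchronous TTS, voice cloning, lip sync, task query
- Add ViduTTSService: business-layer wrapper with 6 curated Chinese preset voices
- Switch all Voice API routes to Vidu
- Add the asynchronous /voice/lip-sync endpoint
- Frontend changes: 16 voices reduced to 6, slider ranges updated, volume defaults to 0
- Add the vidu-tts-api.md developer doc
- docker-compose: add the VIDU_API_KEY environment variable mapping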
@@ -0,0 +1,290 @@
# Vidu TTS API Developer Documentation

> Source: https://platform.vidu.cn/docs/text-to-speech
> Updated: 2026-04-21

## 1. Overview

Vidu (Shengshu Technology) provides text-to-speech (TTS), voice cloning, and lip sync. The TTS and voice cloning endpoints are **synchronous**: they return the result directly and need no polling. Lip sync is the exception; it is asynchronous (see the Lip Sync section).

- **Base URL**: `https://api.vidu.cn`
- **Authentication**: `Authorization: Token {your_api_key}`
- **Content-Type**: `application/json`

---

## 2. Text-to-Speech (TTS)

### Endpoint

```
POST /ent/v2/audio-tts
```

### Request headers

| Field | Value | Description |
|------|-----|------|
| Content-Type | application/json | Data exchange format |
| Authorization | Token {your_api_key} | API key authentication |

### Request body

| Parameter | Type | Required | Description |
|----------|------|------|------|
| text | String | Yes | Text to synthesize, **< 10000 characters**. Supports the `<#x#>` pause marker, where x is the pause length in seconds, range [0.01, 99.99] |
| voice_setting_voice_id | String | Yes | Voice ID |
| voice_setting_speed | Float | No | Speech rate, default 1.0, range [0.5, 2] |
| voice_setting_volume | Int | No | Volume, default 0 (normal volume), range [0, 10]; higher values are louder |
| voice_setting_pitch | Int | No | Pitch, default 0 (original voice), range [-12, 12] |
| voice_setting_emotion | String | No | Emotion control: `happy`/`sad`/`angry`/`fearful`/`disgusted`/`surprised`/`calm`. Usually not needed; the model picks one automatically |
| pronunciation_dict_tone | list | No | Pronunciation overrides for polyphonic characters, e.g. `["燕少飞/(yan4)(shao3)(fei1)"]` |
| payload | String | No | Pass-through parameter, at most 1048576 characters |

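As a minimal sketch of how a caller might assemble `text` with pause markers: the helper names `pause_marker` and `join_with_pauses` are hypothetical (they are not part of the Vidu API or of this repository). Formatting the duration with two decimals, and counting the markers toward the length limit, are assumptions; only the `<#x#>` syntax, the [0.01, 99.99] range, and the 10000-character limit come from the table above.

```python
MAX_TTS_TEXT_CHARS = 10000  # the table above requires text to be < 10000 characters


def pause_marker(seconds: float) -> str:
    """Return a `<#x#>` pause marker for a pause of `seconds` seconds."""
    if not 0.01 <= seconds <= 99.99:
        raise ValueError("pause length must be within [0.01, 99.99] seconds")
    # Assumption: two decimal places are precise enough and accepted by the API.
    return f"<#{seconds:.2f}#>"


def join_with_pauses(sentences: list[str], seconds: float) -> str:
    """Join sentences with one pause marker between each pair, enforcing the length limit."""
    text = pause_marker(seconds).join(sentences)
    # Assumption: markers count toward the limit (the conservative reading).
    if len(text) >= MAX_TTS_TEXT_CHARS:
        raise ValueError("text must be shorter than 10000 characters")
    return text
```

The result of `join_with_pauses` would be passed as the `text` field of the request body.
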
### Response body

```json
{
  "task_id": "your_task_id_here",
  "state": "success",
  "file_url": "https://...",
  "credits": 10,
  "payload": "",
  "created_at": "2025-01-01T15:41:31.968916Z"
}
```

| Field | Type | Description |
|------|------|------|
| task_id | String | Task ID generated by Vidu |
| state | String | `queueing` / `success` / `failed` |
| file_url | String | Audio file URL |
| credits | Int | Credits consumed by this call |
| payload | String | Pass-through parameter |
| created_at | String | Task creation time |

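A caller usually needs only `file_url` from this response. The sketch below is one hedged way to extract it; the names `extract_audio_url` and `ViduResponseError` are hypothetical, and treating a missing `file_url` or any state other than `success` as an error is an assumption. The `err_code` and `message` fallbacks mirror the fields that `ViduProvider` in this commit reads on failure.

```python
from typing import Any


class ViduResponseError(Exception):
    """Raised when a Vidu response carries no usable result (hypothetical helper)."""


def extract_audio_url(data: dict[str, Any]) -> str:
    """Return `file_url` from a TTS response dict, or raise if the task did not succeed."""
    state = data.get("state")
    if state != "success":
        reason = data.get("err_code") or data.get("message") or f"state={state}"
        raise ViduResponseError(f"TTS did not succeed: {reason}")
    file_url = data.get("file_url")
    if not file_url:
        raise ViduResponseError("TTS succeeded but no file_url was returned")
    return file_url
```
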
### Curl example

```bash
curl -X POST https://api.vidu.cn/ent/v2/audio-tts \
  -H "Authorization: Token {your_api_key}" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "你好,欢迎使用vidu开放平台",
    "voice_setting_voice_id": "female-tianmei"
  }'
```

---

## 3. Voice Cloning

### Endpoint

```
POST /ent/v2/audio-clone
```

### Request body

| Parameter | Type | Required | Description |
|----------|------|------|------|
| audio_url | String | Yes | Source audio URL (must be accessible). Format: mp3/m4a/wav; duration: 10 seconds to 5 minutes; size: ≤20MB |
| voice_id | String | Yes | Custom voice ID. Length [8, 256]; must start with an English letter; may contain digits, letters, hyphens, and underscores; must not end with `-` or `_`; must not duplicate an existing ID |
| prompt_audio_url | String | No | Sample audio for cloning (< 8 seconds); can improve timbre similarity and stability |
| prompt_text | String | No | Text of the sample audio; must match the audio content and end with punctuation |
| text | String | Yes | Preview text, ≤1000 characters. It is read aloud in the cloned voice and a preview audio file is returned |
| payload | String | No | Pass-through parameter |

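The `voice_id` format rules can be checked locally before calling the API. The following is a sketch under assumptions: the function name `is_valid_voice_id` is hypothetical, "letters" is read as ASCII letters only, and the uniqueness rule cannot be checked client-side, so it is not covered here.

```python
import re

# Length 8-256: one leading letter, 6-254 middle characters, one trailing letter or digit.
_VOICE_ID_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]{6,254}[A-Za-z0-9]")


def is_valid_voice_id(voice_id: str) -> bool:
    """Check a custom voice ID against the format rules in the table above."""
    return _VOICE_ID_RE.fullmatch(voice_id) is not None
```

Under this reading, the fallback ID built in this commit's routes (`vidu_` followed by eight hex characters) passes, while a six-character ID such as `vidu01` would be rejected as too short.
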
### Response body

```json
{
  "task_id": "your_task_id_here",
  "state": "success",
  "voice_id": "vidu01",
  "demo_audio": "https://...",
  "payload": "",
  "created_at": "2025-01-01T15:41:31.968916Z"
}
```

| Field | Type | Description |
|------|------|------|
| task_id | String | Task ID |
| state | String | `queueing` / `success` / `failed` |
| voice_id | String | User-defined voice_id (not returned if the task fails) |
| demo_audio | String | Preview audio link (returned only when `text` is passed in the request) |
| payload | String | Pass-through parameter |
| created_at | String | Creation time |

### Curl example

```bash
curl -X POST https://api.vidu.cn/ent/v2/audio-clone \
  -H "Authorization: Token {your_api_key}" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "your_audio_url",
    "voice_id": "vidu01",
    "text": "你好,欢迎使用vidu开放平台"
  }'
```

---

## 4. Preset Voice List

There are **16 Mandarin Chinese** voices in total, split into a standard tier and a Beta (premium) tier.

### Standard

| voice_id | Voice name |
|----------|----------|
| male-qn-qingse | Boyish young male (青涩青年音色) |
| male-qn-jingying | Elite young male (精英青年音色) |
| male-qn-badao | Domineering young male (霸道青年音色) |
| male-qn-daxuesheng | Young male college student (青年大学生音色) |
| female-shaonv | Teenage girl (少女音色) |
| female-yujie | Assertive mature female (御姐音色) |
| female-chengshu | Mature female (成熟女性音色) |
| female-tianmei | Sweet female (甜美女性音色) |

### Beta (premium)

| voice_id | Voice name |
|----------|----------|
| male-qn-qingse-jingpin | Boyish young male, beta (青涩青年音色-beta) |
| male-qn-jingying-jingpin | Elite young male, beta (精英青年音色-beta) |
| male-qn-badao-jingpin | Domineering young male, beta (霸道青年音色-beta) |
| male-qn-daxuesheng-jingpin | Young male college student, beta (青年大学生音色-beta) |
| female-shaonv-jingpin | Teenage girl, beta (少女音色-beta) |
| female-yujie-jingpin | Assertive mature female, beta (御姐音色-beta) |
| female-chengshu-jingpin | Mature female, beta (成熟女性音色-beta) |
| female-tianmei-jingpin | Sweet female, beta (甜美女性音色-beta) |

> Voice preview URL format: `https://scene.vidu.zone/media-asset/{id}.mp3` (see the original links in the Feishu spreadsheet)

---

## 5. Comparison with MiniMax (integration reference)

| Dimension | Vidu | MiniMax |
|------|------|---------|
| Base URL | `https://api.vidu.cn` | `https://api.minimaxi.com` |
| Authentication | `Token {key}` | `Bearer {key}` |
| TTS endpoint | `POST /ent/v2/audio-tts` | `POST /v1/t2a_v2` |
| Sync/async | Synchronous | Synchronous + asynchronous |
| Text limit | 10000 characters | 10000 characters (synchronous) |
| Speech rate range | 0.5 to 2.0 (Float) | Must be passed as Int |
| Volume range | 0 to 10 (Int, 0 = normal) | Must be passed as Int |
| Pitch range | -12 to 12 (Int) | Must be passed as Int |
| Emotion control | 7 emotions available | Not supported |
| Polyphonic characters | Supported via `pronunciation_dict_tone` | Not supported |
| Voice cloning | Synchronous, caller-defined voice_id | Asynchronous, system-assigned voice_id |
| Cloning audio requirements | 10 seconds to 5 minutes, ≤20MB | About 10 seconds to 5 minutes |
| Preset voices | 16 Chinese | 6 Chinese |
| Response audio field | `file_url` | `audio` |

---

## 6. Lip Sync

### Endpoint

```
POST /ent/v2/lip-sync
```

**⚠️ Asynchronous endpoint.** Creating a task returns a task_id; poll the query endpoint or supply a callback_url to receive the result.

### Request body

| Parameter | Type | Required | Description |
|----------|------|------|------|
| video_url | String | Yes | Source video URL (must be accessible). Format: mp4/mov/avi; duration: 1 to 600 seconds (10 to 120 seconds recommended); size: ≤5GB; resolution: 360p to 4096p; codec: H.264 |
| audio_url | String | No | Audio file URL. Format: wav/mp3/wma/m4a/aac/ogg; duration: >1s and <600s; size: ≤100MB |
| text | String | No | Text content, 4 to 2000 characters. If both `text` and `audio_url` are set, `audio_url` takes precedence |
| speed | Float | No | Speech rate, default 1.0, range [0.5, 2]. Only applies to text-driven generation |
| voice_id | String | No | Voice ID. Only applies to text-driven generation |
| volume | Int | No | Volume, default 0 (normal volume), range [0, 10]. Only applies to text-driven generation |
| ref_photo_url | String | No | Face reference image URL (jpg/jpeg/png/bmp/webp, 192 to 4096 px, ≤10MB). When the video contains several faces, it selects which person to lip-sync |
| callback_url | String | No | Callback URL; receives a POST whenever the task state changes |

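Because `audio_url` overrides `text`, and `speed`, `voice_id`, and `volume` only apply to text-driven generation, a client can send just one of the two shapes. The sketch below illustrates that choice; `build_lip_sync_body` is a hypothetical helper, and requiring at least one of `audio_url` or `text` follows the check in this commit's `/voice/lip-sync` route rather than anything stated in the table.

```python
from typing import Any, Optional


def build_lip_sync_body(
    video_url: str,
    audio_url: Optional[str] = None,
    text: Optional[str] = None,
    voice_id: Optional[str] = None,
    speed: float = 1.0,
    volume: int = 0,
) -> dict[str, Any]:
    """Build a lip-sync request body: audio-driven if audio_url is set, else text-driven."""
    body: dict[str, Any] = {"video_url": video_url}
    if audio_url:
        # audio_url takes precedence, so the text-only fields are left out entirely.
        body["audio_url"] = audio_url
        return body
    if text is None or not 4 <= len(text) <= 2000:
        raise ValueError("without audio_url, text must be 4 to 2000 characters")
    body["text"] = text
    if voice_id:
        body["voice_id"] = voice_id
    body["speed"] = speed
    body["volume"] = volume
    return body
```
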
### Video material requirements

- A real person on camera (cartoon characters need near-human facial proportions)
- Face toward the camera, with at most 45° of horizontal rotation and at most 15° of pitch
- Keep the face unobstructed where possible, with stable lighting on it

### Create response

```json
{
  "task_id": "your_task_id_here",
  "state": "created",
  "payload": "",
  "created_at": "2025-01-01T15:41:31.968916Z"
}
```

### Query task status

```
GET /ent/v2/tasks/{task_id}/creations
```

**Response body**:

| Field | Type | Description |
|------|------|------|
| id | String | Task ID |
| state | String | `created`/`queueing`/`processing`/`success`/`failed` |
| err_code | String | Error code |
| credits | Int | Credits consumed |
| payload | String | Pass-through parameter |
| bgm | Bool | Whether BGM is used |
| off_peak | Bool | Whether off-peak mode is used |
| creations | Array | List of generated results |
| creations[].id | String | Result ID |
| creations[].url | String | Result URL (valid for 24 hours) |
| creations[].cover_url | String | Result cover image URL (valid for 24 hours) |
| creations[].watermarked_url | String | Watermarked result URL |

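Polling the query endpoint can be sketched as a small loop that stops on a terminal state. This is an illustration, not the repository's implementation: `wait_for_task` is a hypothetical name, the 5-second interval and 60-attempt cap are arbitrary placeholders, and `query` is any coroutine function that returns the parsed query response (in this codebase `ViduProvider().query_task` would fit).

```python
import asyncio
from typing import Any, Awaitable, Callable

# States after which the task will not change again, per the state list above.
TERMINAL_STATES = {"success", "failed"}


async def wait_for_task(
    query: Callable[[str], Awaitable[dict[str, Any]]],
    task_id: str,
    interval: float = 5.0,
    max_attempts: int = 60,
) -> dict[str, Any]:
    """Poll `query(task_id)` until the task reaches `success` or `failed`."""
    for attempt in range(max_attempts):
        data = await query(task_id)
        if data.get("state") in TERMINAL_STATES:
            return data
        if attempt < max_attempts - 1:
            # Wait before the next poll; skipped after the final attempt.
            await asyncio.sleep(interval)
    raise TimeoutError(f"task {task_id} still running after {max_attempts} polls")
```

Since result URLs expire after 24 hours, a caller would normally download or re-host `creations[0]["url"]` soon after a `success` result.
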
### Curl example (audio-driven)

```bash
curl -X POST https://api.vidu.cn/ent/v2/lip-sync \
  -H "Authorization: Token {your_api_key}" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "your_video_url",
    "audio_url": "your_audio_url"
  }'
```

### Curl example (text-driven)

```bash
curl -X POST https://api.vidu.cn/ent/v2/lip-sync \
  -H "Authorization: Token {your_api_key}" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "your_video_url",
    "text": "你好,欢迎使用vidu开放平台",
    "voice_id": "female-tianmei",
    "speed": 1.0
  }'
```

---

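## 7. Integration Notes

1. **Vidu strengths**: emotion control, polyphonic-character annotation, 16 voices (including the premium tier), synchronous cloning, lip sync
2. **Vidu weaknesses**: there is no dedicated "list voices" API; the voice list is maintained in a Feishu spreadsheet
3. **Endpoint types differ**:
   - TTS / voice cloning: **synchronous**, results are returned directly
   - Lip sync: **asynchronous**; poll `GET /ent/v2/tasks/{task_id}/creations` or use a callback
4. **Speed/volume/pitch types**: in Vidu, speed is a **Float** while volume and pitch are **Int** (unlike MiniMax, which requires Int for all three)
5. **Frontend changes**: set the speed slider range to 0.5 to 2.0, volume to 0 to 10, and pitch to -12 to 12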
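The type and range notes in items 4 and 5 can be enforced in one place before a request is built. A minimal sketch, assuming raw slider values arrive as numbers; `to_vidu_voice_settings` and `_clamp` are hypothetical helpers, and clamping out-of-range input (rather than rejecting it) is a design choice made for this example only.

```python
def _clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def to_vidu_voice_settings(speed: float, volume: float, pitch: float) -> dict:
    """Coerce raw slider values into the types and ranges listed for the TTS request."""
    return {
        # Float in [0.5, 2.0]
        "voice_setting_speed": float(_clamp(speed, 0.5, 2.0)),
        # Int in [0, 10]; 0 means normal volume
        "voice_setting_volume": int(_clamp(round(volume), 0, 10)),
        # Int in [-12, 12]; 0 means the original voice
        "voice_setting_pitch": int(_clamp(round(pitch), -12, 12)),
    }
```

The server-side Pydantic models in this commit reject out-of-range values instead, so a helper like this would sit on the client side if it is used at all.
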
@@ -0,0 +1,184 @@
"""
Vidu API Provider
=================

Wraps the Vidu voice/video HTTP APIs:
- Synchronous TTS (/ent/v2/audio-tts)
- Voice cloning (/ent/v2/audio-clone)
- Lip sync (/ent/v2/lip-sync)
- Task query (/ent/v2/tasks/{id}/creations)

Authentication: Token {api_key} (Authorization header)
"""

from __future__ import annotations

import logging
from typing import Any

import aiohttp

from app.config import get_settings

logger = logging.getLogger(__name__)


class ViduProvider:
    """Vidu API client wrapper"""

    def __init__(self, api_key: str | None = None, base_url: str | None = None):
        settings = get_settings()
        self.api_key = api_key or settings.VIDU_API_KEY
        self.base_url = (base_url or settings.VIDU_BASE_URL).rstrip("/")

    def _get_headers(self) -> dict[str, str]:
        return {
            "Authorization": f"Token {self.api_key}",
            "Content-Type": "application/json",
        }

    # ==================== TTS speech synthesis ====================

    async def tts_sync(
        self,
        text: str,
        voice_id: str,
        speed: float = 1.0,
        volume: int = 0,
        pitch: int = 0,
        emotion: str | None = None,
        pronunciation_dict_tone: list[str] | None = None,
        payload: str | None = None,
    ) -> dict[str, Any]:
        """
        Synchronous speech synthesis

        POST /ent/v2/audio-tts
        """
        url = f"{self.base_url}/ent/v2/audio-tts"

        body: dict[str, Any] = {
            "text": text,
            "voice_setting_voice_id": voice_id,
            "voice_setting_speed": speed,
            "voice_setting_volume": volume,
            "voice_setting_pitch": pitch,
        }
        # Optional fields are sent only when the caller provides them.
        if emotion:
            body["voice_setting_emotion"] = emotion
        if pronunciation_dict_tone:
            body["pronunciation_dict_tone"] = pronunciation_dict_tone
        if payload:
            body["payload"] = payload

        async with aiohttp.ClientSession() as session:
            async with session.post(url, json=body, headers=self._get_headers()) as resp:
                data = await resp.json()
                if resp.status != 200 or data.get("state") == "failed":
                    msg = data.get("err_code") or data.get("message") or f"HTTP {resp.status}"
                    raise Exception(f"Vidu TTS error: {msg}")
                return data

    # ==================== Voice cloning ====================

    async def clone_voice(
        self,
        audio_url: str,
        voice_id: str,
        text: str,
        prompt_audio_url: str | None = None,
        prompt_text: str | None = None,
        payload: str | None = None,
    ) -> dict[str, Any]:
        """
        Voice cloning (synchronous endpoint)

        POST /ent/v2/audio-clone
        """
        url = f"{self.base_url}/ent/v2/audio-clone"

        body: dict[str, Any] = {
            "audio_url": audio_url,
            "voice_id": voice_id,
            "text": text,
        }
        if prompt_audio_url:
            body["prompt_audio_url"] = prompt_audio_url
        if prompt_text:
            body["prompt_text"] = prompt_text
        if payload:
            body["payload"] = payload

        async with aiohttp.ClientSession() as session:
            async with session.post(url, json=body, headers=self._get_headers()) as resp:
                data = await resp.json()
                if resp.status != 200 or data.get("state") == "failed":
                    msg = data.get("err_code") or data.get("message") or f"HTTP {resp.status}"
                    raise Exception(f"Vidu clone error: {msg}")
                return data

    # ==================== Lip sync ====================

    async def lip_sync(
        self,
        video_url: str,
        audio_url: str | None = None,
        text: str | None = None,
        voice_id: str | None = None,
        speed: float = 1.0,
        volume: int = 0,
        ref_photo_url: str | None = None,
        callback_url: str | None = None,
        payload: str | None = None,
    ) -> dict[str, Any]:
        """
        Lip sync (asynchronous endpoint)

        POST /ent/v2/lip-sync
        """
        url = f"{self.base_url}/ent/v2/lip-sync"

        body: dict[str, Any] = {"video_url": video_url}

        if audio_url:
            body["audio_url"] = audio_url
        if text:
            body["text"] = text
        if voice_id:
            body["voice_id"] = voice_id
        # speed and volume are sent only when they differ from the API defaults.
        if speed != 1.0:
            body["speed"] = speed
        if volume != 0:
            body["volume"] = volume
        if ref_photo_url:
            body["ref_photo_url"] = ref_photo_url
        if callback_url:
            body["callback_url"] = callback_url
        if payload:
            body["payload"] = payload

        async with aiohttp.ClientSession() as session:
            async with session.post(url, json=body, headers=self._get_headers()) as resp:
                data = await resp.json()
                if resp.status != 200 or data.get("state") == "failed":
                    msg = data.get("err_code") or data.get("message") or f"HTTP {resp.status}"
                    raise Exception(f"Vidu lip-sync error: {msg}")
                return data

    # ==================== Task query ====================

    async def query_task(self, task_id: str) -> dict[str, Any]:
        """
        Query task status and generated results

        GET /ent/v2/tasks/{task_id}/creations
        """
        url = f"{self.base_url}/ent/v2/tasks/{task_id}/creations"

        async with aiohttp.ClientSession() as session:
            async with session.get(url, headers=self._get_headers()) as resp:
                data = await resp.json()
                if resp.status != 200:
                    msg = data.get("err_code") or data.get("message") or f"HTTP {resp.status}"
                    raise Exception(f"Vidu query task error: {msg}")
                return data
+277
-70
@@ -3,19 +3,24 @@
|
||||
=======================
|
||||
|
||||
提供 TTS 语音合成、批量合成、声音克隆等功能。
|
||||
基于 Kling AI TTS 和声音克隆 API。
|
||||
基于 MiniMax TTS 和声音克隆 API。
|
||||
(Kling AI 语音相关代码保留但已废弃,仅视频/形象克隆仍使用 Kling)
|
||||
"""
|
||||
|
||||
import logging
|
||||
import tempfile
|
||||
import uuid
|
||||
from pathlib import Path
|
||||
|
||||
from fastapi import APIRouter, HTTPException
|
||||
from fastapi import APIRouter, File, Form, HTTPException, UploadFile
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from app.schemas.common import ApiResponse, success_response
|
||||
from app.services.tts_service import TTSService
|
||||
from app.services.voice_clone_service import VoiceCloneService
|
||||
from app.services.qiniu_service import QiniuService
|
||||
from app.services.vidu_tts_service import ViduTTSService
|
||||
from app.services.minimax_tts_service import MiniMaxTTSService # noqa: F401 历史兼容
|
||||
from app.services.tts_service import TTSService # noqa: F401 历史兼容
|
||||
from app.services.voice_clone_service import VoiceCloneService # noqa: F401 历史兼容
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
router = APIRouter(prefix="/voice", tags=["Voice"])
|
||||
@@ -27,10 +32,12 @@ router = APIRouter(prefix="/voice", tags=["Voice"])
|
||||
class TTSSynthesizeRequest(BaseModel):
|
||||
"""TTS 合成请求"""
|
||||
|
||||
text: str = Field(..., min_length=1, max_length=1000, description="待合成文本(≤1000字)")
|
||||
voice_id: str | None = Field(None, description="音色 ID(默认:温柔女声)")
|
||||
speed: float = Field(default=1.0, ge=0.8, le=2.0, description="语速 0.8-2.0")
|
||||
text: str = Field(..., min_length=1, max_length=10000, description="待合成文本(≤10000字符)")
|
||||
voice_id: str | None = Field(None, description="音色 ID(默认:甜美女性)")
|
||||
speed: float = Field(default=1.0, ge=0.5, le=2.0, description="语速 0.5-2.0")
|
||||
voice_language: str = Field(default="zh", description="音色语种 (zh/en)")
|
||||
volume: int = Field(default=0, ge=0, le=10, description="音量 0-10(0=正常)")
|
||||
pitch: int = Field(default=0, ge=-12, le=12, description="音调 -12 到 12")
|
||||
|
||||
|
||||
class TTSBatchSegment(BaseModel):
|
||||
@@ -46,7 +53,9 @@ class TTSBatchRequest(BaseModel):
|
||||
|
||||
segments: list[TTSBatchSegment] = Field(..., min_length=1, description="段落列表")
|
||||
voice_id: str | None = Field(None, description="音色 ID")
|
||||
speed: float = Field(default=1.0, ge=0.8, le=2.0, description="语速")
|
||||
speed: float = Field(default=1.0, ge=0.5, le=2.0, description="语速")
|
||||
volume: int = Field(default=0, ge=0, le=10, description="音量 0-10")
|
||||
pitch: int = Field(default=0, ge=-12, le=12, description="音调 -12 到 12")
|
||||
|
||||
|
||||
class VoiceCloneSubmitRequest(BaseModel):
|
||||
@@ -77,6 +86,13 @@ class VoiceCloneTaskResponse(BaseModel):
|
||||
error_message: str | None = None
|
||||
|
||||
|
||||
class VoiceUploadResponse(BaseModel):
|
||||
"""音频上传响应"""
|
||||
|
||||
url: str = Field(..., description="七牛云访问 URL")
|
||||
key: str = Field(..., description="存储 Key")
|
||||
|
||||
|
||||
class VoiceInfo(BaseModel):
|
||||
"""音色信息"""
|
||||
|
||||
@@ -85,11 +101,109 @@ class VoiceInfo(BaseModel):
|
||||
description: str = ""
|
||||
language: str = "zh"
|
||||
recommended: bool = False
|
||||
previewUrl: str | None = None
|
||||
|
||||
|
||||
class LipSyncRequest(BaseModel):
|
||||
"""对口型请求"""
|
||||
|
||||
video_url: str = Field(..., description="原视频 URL")
|
||||
audio_url: str | None = Field(None, description="音频 URL(与 text 二选一)")
|
||||
text: str | None = Field(None, description="文本内容(与 audio_url 二选一)")
|
||||
voice_id: str | None = Field(None, description="音色 ID(文字驱动时生效)")
|
||||
speed: float = Field(default=1.0, ge=0.5, le=2.0, description="语速")
|
||||
volume: int = Field(default=0, ge=0, le=10, description="音量")
|
||||
ref_photo_url: str | None = Field(None, description="人脸参考图 URL")
|
||||
|
||||
|
||||
class LipSyncResponse(BaseModel):
|
||||
"""对口型响应"""
|
||||
|
||||
task_id: str
|
||||
state: str
|
||||
|
||||
|
||||
class LipSyncQueryResponse(BaseModel):
|
||||
"""对口型查询响应"""
|
||||
|
||||
task_id: str
|
||||
state: str
|
||||
video_url: str | None = None
|
||||
cover_url: str | None = None
|
||||
|
||||
|
||||
# ========== API 路由 ==========
|
||||
|
||||
|
||||
@router.post("/upload", response_model=ApiResponse[VoiceUploadResponse])
|
||||
async def upload_voice_file(
|
||||
file: UploadFile = File(...),
|
||||
file_type: str = Form(default="audio", description="文件类型: audio | video"),
|
||||
):
|
||||
"""
|
||||
上传音频/视频文件到七牛云
|
||||
|
||||
接收音频(mp3/wav)或视频(mp4/mov)文件,上传至七牛云 media bucket,
|
||||
返回公开访问 URL。
|
||||
"""
|
||||
try:
|
||||
file_type = file_type.lower().strip()
|
||||
if file_type not in ("audio", "video"):
|
||||
raise HTTPException(status_code=400, detail="file_type 必须是 audio 或 video")
|
||||
|
||||
# 根据类型校验 MIME
|
||||
if file_type == "audio":
|
||||
allowed_types = {"audio/mpeg", "audio/mp3", "audio/wav"}
|
||||
max_size = 50 * 1024 * 1024 # 50MB
|
||||
prefix = "meijiaka-zj/voice"
|
||||
type_label = "音频"
|
||||
else:
|
||||
allowed_types = {"video/mp4", "video/quicktime"}
|
||||
max_size = 200 * 1024 * 1024 # 200MB
|
||||
prefix = "meijiaka-zj/avatar"
|
||||
type_label = "视频"
|
||||
|
||||
content_type = file.content_type or "application/octet-stream"
|
||||
if content_type not in allowed_types:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=f"不支持的{type_label}格式: {content_type},仅支持 {', '.join(allowed_types)}",
|
||||
)
|
||||
|
||||
# 读取文件内容
|
||||
content = await file.read()
|
||||
if len(content) > max_size:
|
||||
raise HTTPException(status_code=400, detail=f"{type_label}文件大小不能超过 {max_size // 1024 // 1024}MB")
|
||||
|
||||
# 生成存储 key
|
||||
ext = content_type.split("/")[-1].replace("quicktime", "mov").replace("mpeg", "mp3")
|
||||
key = f"{prefix}/{uuid.uuid4().hex}.{ext}"
|
||||
|
||||
# 上传到七牛云
|
||||
qiniu = QiniuService()
|
||||
from io import BytesIO
|
||||
|
||||
qiniu.upload_stream(
|
||||
stream=BytesIO(content),
|
||||
key=key,
|
||||
mime_type=content_type,
|
||||
)
|
||||
|
||||
# 获取公开 URL(media bucket 使用 video_domain)
|
||||
url = qiniu.get_file_url(qiniu.video_domain, key)
|
||||
|
||||
return success_response(
|
||||
data=VoiceUploadResponse(url=url, key=key),
|
||||
message="上传成功",
|
||||
)
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"[Voice] 上传失败: {e}")
|
||||
raise HTTPException(status_code=500, detail=f"上传失败: {str(e)}")
|
||||
|
||||
|
||||
@router.get("/voices", response_model=ApiResponse[list[VoiceInfo]])
|
||||
async def list_voices():
|
||||
"""
|
||||
@@ -97,13 +211,26 @@ async def list_voices():
|
||||
|
||||
返回预设的音色选项,用户可选择喜欢的音色进行 TTS 合成。
|
||||
"""
|
||||
voices = TTSService.get_preset_voices()
|
||||
voices = ViduTTSService.get_preset_voices()
|
||||
return success_response(
|
||||
data=[VoiceInfo(**v) for v in voices],
|
||||
message="获取音色列表成功",
|
||||
)
|
||||
|
||||
|
||||
@router.get("/preset-voices/raw", response_model=ApiResponse[list[dict]])
|
||||
async def list_preset_voices_raw():
|
||||
"""
|
||||
【已废弃】KlingAI 官方预置音色列表
|
||||
|
||||
语音功能已迁移至 Vidu,此端点保留仅作历史兼容。
|
||||
"""
|
||||
return success_response(
|
||||
data=[],
|
||||
message="语音功能已迁移至 Vidu,请使用 /voices 获取音色列表",
|
||||
)
|
||||
|
||||
|
||||
@router.post("/synthesize", response_model=ApiResponse[dict])
|
||||
async def synthesize_speech(request: TTSSynthesizeRequest):
|
||||
"""
|
||||
@@ -113,12 +240,13 @@ async def synthesize_speech(request: TTSSynthesizeRequest):
|
||||
适用于短文本(≤1000字),长文本建议使用 /synthesize-batch。
|
||||
"""
|
||||
try:
|
||||
service = TTSService()
|
||||
service = ViduTTSService()
|
||||
audio_url = await service.synthesize_sync(
|
||||
text=request.text,
|
||||
voice_id=request.voice_id,
|
||||
speed=request.speed,
|
||||
voice_language=request.voice_language,
|
||||
volume=request.volume,
|
||||
pitch=request.pitch,
|
||||
)
|
||||
|
||||
return success_response(
|
||||
@@ -126,7 +254,7 @@ async def synthesize_speech(request: TTSSynthesizeRequest):
|
||||
"audio_url": audio_url,
|
||||
"format": "mp3",
|
||||
"text": request.text,
|
||||
"voice_id": request.voice_id or "829826751244537879",
|
||||
"voice_id": request.voice_id or ViduTTSService.DEFAULT_VOICE_ID,
|
||||
},
|
||||
message="合成成功",
|
||||
)
|
||||
@@ -154,13 +282,31 @@ async def synthesize_batch(request: TTSBatchRequest):
|
||||
|
||||
segments_data = [s.model_dump() for s in request.segments]
|
||||
|
||||
service = TTSService()
|
||||
results = await service.batch_synthesize(
|
||||
segments=segments_data,
|
||||
output_dir=output_dir,
|
||||
voice_id=request.voice_id,
|
||||
speed=request.speed,
|
||||
)
|
||||
service = ViduTTSService()
|
||||
# Vidu 暂不支持批量合成,逐段调用
|
||||
results = []
|
||||
for seg in segments_data:
|
||||
try:
|
||||
audio_url = await service.synthesize_sync(
|
||||
text=seg["text"],
|
||||
voice_id=request.voice_id,
|
||||
speed=request.speed,
|
||||
volume=request.volume,
|
||||
pitch=request.pitch,
|
||||
)
|
||||
results.append({
|
||||
"index": seg.get("index", 0),
|
||||
"success": True,
|
||||
"audio_url": audio_url,
|
||||
"filename": seg.get("filename"),
|
||||
})
|
||||
except Exception as e:
|
||||
results.append({
|
||||
"index": seg.get("index", 0),
|
||||
"success": False,
|
||||
"error": str(e),
|
||||
"filename": seg.get("filename"),
|
||||
})
|
||||
|
||||
success_count = sum(1 for r in results if r["success"])
|
||||
failed_count = len(results) - success_count
|
||||
@@ -188,20 +334,28 @@ async def synthesize_to_file(request: TTSSynthesizeRequest, output_path: str):
|
||||
将文本转换为语音并保存到指定文件路径。
|
||||
"""
|
||||
try:
|
||||
service = TTSService()
|
||||
saved_path = await service.synthesize_to_file(
|
||||
service = ViduTTSService()
|
||||
audio_url = await service.synthesize_sync(
|
||||
text=request.text,
|
||||
output_path=output_path,
|
||||
voice_id=request.voice_id,
|
||||
speed=request.speed,
|
||||
voice_language=request.voice_language,
|
||||
volume=request.volume,
|
||||
pitch=request.pitch,
|
||||
)
|
||||
|
||||
# 下载音频并保存到指定路径
|
||||
import httpx
|
||||
async with httpx.AsyncClient() as client:
|
||||
response = await client.get(audio_url)
|
||||
response.raise_for_status()
|
||||
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
Path(output_path).write_bytes(response.content)
|
||||
|
||||
return success_response(
|
||||
data={
|
||||
"file_path": str(saved_path),
|
||||
"file_path": output_path,
|
||||
"text": request.text,
|
||||
"voice_id": request.voice_id or "829826751244537879",
|
||||
"voice_id": request.voice_id or ViduTTSService.DEFAULT_VOICE_ID,
|
||||
},
|
||||
message="文件保存成功",
|
||||
)
|
||||
@@ -217,26 +371,26 @@ async def synthesize_to_file(request: TTSSynthesizeRequest, output_path: str):
|
||||
@router.post("/clone/submit", response_model=ApiResponse[VoiceCloneTaskResponse])
|
||||
async def submit_clone_task(request: VoiceCloneSubmitRequest):
|
||||
"""
|
||||
提交声音克隆任务
|
||||
提交声音克隆任务(Vidu)
|
||||
|
||||
提交音频/视频 URL 进行声音克隆,返回任务 ID 用于后续查询。
|
||||
支持三种来源:source_audio_url、source_video_url、video_id。
|
||||
Vidu 声音复刻是同步接口,直接返回结果。
|
||||
"""
|
||||
try:
|
||||
service = VoiceCloneService()
|
||||
task_id = await service.submit_clone_task(
|
||||
source_audio_url=request.source_audio_url,
|
||||
source_video_url=request.source_video_url,
|
||||
video_id=request.video_id,
|
||||
voice_name=request.voice_name,
|
||||
service = ViduTTSService()
|
||||
result = await service.clone_voice(
|
||||
audio_url=request.source_audio_url or "",
|
||||
voice_id=request.voice_name or f"vidu_{uuid.uuid4().hex[:8]}",
|
||||
)
|
||||
|
||||
# Vidu 同步返回,状态直接为 succeeded
|
||||
return success_response(
|
||||
data=VoiceCloneTaskResponse(
|
||||
task_id=task_id,
|
||||
status="pending",
|
||||
task_id=result.get("task_id", ""),
|
||||
status="succeeded",
|
||||
voice_id=result.get("voice_id"),
|
||||
trial_url=result.get("demo_audio"),
|
||||
),
|
||||
message="克隆任务已提交",
|
||||
message="克隆成功",
|
||||
)
|
||||
|
||||
except ValueError as e:
|
||||
@@ -250,29 +404,17 @@ async def submit_clone_task(request: VoiceCloneSubmitRequest):
|
||||
@router.get("/clone/query/{task_id}", response_model=ApiResponse[VoiceCloneTaskResponse])
|
||||
async def query_clone_task(task_id: str, blocking: bool = False):
|
||||
"""
|
||||
查询声音克隆任务状态
|
||||
查询声音克隆任务状态(Vidu)
|
||||
|
||||
Args:
|
||||
task_id: 任务 ID
|
||||
blocking: 是否阻塞等待完成(默认 False)
|
||||
Vidu 声音复刻是同步接口,此端点仅做兼容,直接返回成功状态。
|
||||
"""
|
||||
try:
|
||||
service = VoiceCloneService()
|
||||
result = await service.query_clone_task(task_id, blocking=blocking)
|
||||
|
||||
return success_response(
|
||||
data=VoiceCloneTaskResponse(
|
||||
task_id=result["task_id"],
|
||||
status=result["status"],
|
||||
voice_id=result.get("voice_id"),
|
||||
trial_url=result.get("trial_url"),
|
||||
error_message=result.get("error_message"),
|
||||
)
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[Voice] 查询克隆任务失败: {e}")
|
||||
raise HTTPException(status_code=500, detail=f"查询失败: {str(e)}")
|
||||
return success_response(
|
||||
data=VoiceCloneTaskResponse(
|
||||
task_id=task_id,
|
||||
status="succeeded",
|
||||
),
|
||||
message="克隆已完成",
|
||||
)
|
||||
|
||||
|
||||
@router.post("/clone/clone-and-wait", response_model=ApiResponse[VoiceCloneTaskResponse])
|
||||
@@ -284,24 +426,20 @@ async def clone_and_wait(request: VoiceCloneSubmitRequest, poll_interval: float
|
||||
适用于需要等待克隆完成的场景。
|
||||
"""
|
||||
try:
|
||||
service = VoiceCloneService()
|
||||
result = await service.wait_for_clone(
|
||||
source_audio_url=request.source_audio_url,
|
||||
source_video_url=request.source_video_url,
|
||||
video_id=request.video_id,
|
||||
voice_name=request.voice_name,
|
||||
poll_interval=poll_interval,
|
||||
service = ViduTTSService()
|
||||
result = await service.clone_voice(
|
||||
audio_url=request.source_audio_url or "",
|
||||
voice_id=request.voice_name or f"vidu_{uuid.uuid4().hex[:8]}",
|
||||
)
|
||||
|
||||
return success_response(
|
||||
data=VoiceCloneTaskResponse(
|
||||
task_id=result["task_id"],
|
||||
status=result["status"],
|
||||
task_id=result.get("task_id", ""),
|
||||
status="succeeded",
|
||||
voice_id=result.get("voice_id"),
|
||||
trial_url=result.get("trial_url"),
|
||||
error_message=result.get("error_message"),
|
||||
trial_url=result.get("demo_audio"),
|
||||
),
|
||||
message=f"克隆任务完成,状态: {result['status']}",
|
||||
message="克隆成功",
|
||||
)
|
||||
|
||||
except ValueError as e:
|
||||
@@ -312,4 +450,73 @@ async def clone_and_wait(request: VoiceCloneSubmitRequest, poll_interval: float
|
||||
raise HTTPException(status_code=500, detail=f"克隆失败: {str(e)}")
|
||||
|
||||
# ==================== Lip sync ====================


@router.post("/lip-sync", response_model=ApiResponse[LipSyncResponse])
async def create_lip_sync(request: LipSyncRequest):
    """
    Create a lip-sync task (asynchronous endpoint)

    Takes a video plus audio or text and generates a lip-synced video.
    Returns a task_id; query the result via /lip-sync/{task_id}.
    """
    try:
        # At least one driver (audio or text) is required.
        if not request.audio_url and not request.text:
            raise ValueError("audio_url 和 text 至少传一个")

        service = ViduTTSService()
        task_id = await service.lip_sync_create(
            video_url=request.video_url,
            audio_url=request.audio_url,
            text=request.text,
            voice_id=request.voice_id,
            speed=request.speed,
            volume=request.volume,
            ref_photo_url=request.ref_photo_url,
        )

        return success_response(
            data=LipSyncResponse(task_id=task_id, state="created"),
            message="对口型任务已创建",
        )

    except ValueError as e:
        logger.warning(f"[Voice] 对口型参数错误: {e}")
        raise HTTPException(status_code=422, detail=str(e))
    except Exception as e:
        logger.error(f"[Voice] 对口型任务创建失败: {e}")
        raise HTTPException(status_code=500, detail=f"创建失败: {str(e)}")


@router.get("/lip-sync/{task_id}", response_model=ApiResponse[LipSyncQueryResponse])
async def query_lip_sync(task_id: str):
    """
    Query lip-sync task status

    Returns the task state and the result URLs (valid for 24 hours).
    """
    try:
        service = ViduTTSService()
        result = await service.lip_sync_query(task_id)

        state = result.get("state", "unknown")
        creations = result.get("creations", [])
        # Only the first generated result is exposed to the caller.
        video_url = creations[0].get("url") if creations else None
        cover_url = creations[0].get("cover_url") if creations else None

        return success_response(
            data=LipSyncQueryResponse(
                task_id=task_id,
                state=state,
                video_url=video_url,
                cover_url=cover_url,
            ),
            message=f"任务状态: {state}",
        )

    except Exception as e:
        logger.error(f"[Voice] 查询对口型任务失败: {e}")
        raise HTTPException(status_code=500, detail=f"查询失败: {str(e)}")

@@ -119,6 +119,20 @@ class Settings(BaseSettings):
|
||||
KLINGAI_ACCESS_KEY: str | None = Field(default=None, description="KlingAI Access Key")
|
||||
KLINGAI_SECRET_KEY: str | None = Field(default=None, description="KlingAI Secret Key")
|
||||
|
||||
# MiniMax 配置
|
||||
MINIMAX_API_KEY: str | None = Field(default=None, description="MiniMax API Key")
|
||||
MINIMAX_BASE_URL: str = Field(
|
||||
default="https://api.minimaxi.com",
|
||||
description="MiniMax Base URL(国内: api.minimaxi.com, 国际: api.minimax.io)",
|
||||
)
|
||||
|
||||
# Vidu 配置
|
||||
VIDU_API_KEY: str | None = Field(default=None, description="Vidu API Key")
|
||||
VIDU_BASE_URL: str = Field(
|
||||
default="https://api.vidu.cn",
|
||||
description="Vidu Base URL",
|
||||
)
|
||||
|
||||
# 七牛云存储配置
|
||||
QINIU_ACCESS_KEY: str | None = Field(default=None, description="七牛云 Access Key")
|
||||
QINIU_SECRET_KEY: str | None = Field(default=None, description="七牛云 Secret Key")
|
||||
|
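The two new settings, `VIDU_API_KEY` and `VIDU_BASE_URL`, are what a provider needs to build an authenticated request. Below is a hedged sketch of that step, assuming the `Authorization: Token {api_key}` scheme from the Vidu docs. The helper name `vidu_request_config` is hypothetical and is not how `ViduProvider` is known to work; it reads a plain mapping rather than the project's `Settings` class so that it stays self-contained.

```python
import os
from typing import Dict, Mapping, Optional, Tuple

DEFAULT_VIDU_BASE_URL = "https://api.vidu.cn"  # same default as the Settings field


def vidu_request_config(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (base_url, headers) for Vidu calls; fail fast when the key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("VIDU_API_KEY")
    if not api_key:
        raise RuntimeError("VIDU_API_KEY is not set")
    base_url = env.get("VIDU_BASE_URL") or DEFAULT_VIDU_BASE_URL
    headers = {
        "Authorization": f"Token {api_key}",
        "Content-Type": "application/json",
    }
    # Strip a trailing slash so paths such as "/ent/v2/audio-tts" join cleanly.
    return base_url.rstrip("/"), headers
```

Failing at configuration time gives a clearer error than a 401 from the first API call, which matters here because `VIDU_API_KEY` defaults to `None`.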
||||
@@ -0,0 +1,241 @@
|
||||
"""
|
||||
Vidu TTS 服务封装
|
||||
=================
|
||||
|
||||
业务层封装:
|
||||
- 同步 TTS
|
||||
- 声音复刻
|
||||
- 对口型(异步,需轮询)
|
||||
- 预设音色列表
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import Any
|
||||
|
||||
from app.ai.providers.vidu_provider import ViduProvider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Vidu 预设音色(底层为 MiniMax,兼容 MiniMax 音色 ID)
|
||||
VIDU_PRESET_VOICES = [
|
||||
{
|
||||
"voice_id": "tianxin_xiaoling",
|
||||
"name": "甜心小玲",
|
||||
"language": "zh",
|
||||
"description": "甜美可爱,活泼俏皮",
|
||||
"recommended": True,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/voice/tianxin_xiaoling.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "danya_xuejie",
|
||||
"name": "淡雅学姐",
|
||||
"language": "zh",
|
||||
"description": "淡雅知性,温婉柔和",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/voice/danya_xuejie.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "Chinese (Mandarin)_Warm_Girl",
|
||||
"name": "温暖少女",
|
||||
"language": "zh",
|
||||
"description": "温暖亲切,清新自然",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/voice/Warm_Girl.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "Chinese (Mandarin)_Radio_Host",
|
||||
"name": "电台男主播",
|
||||
"language": "zh",
|
||||
"description": "专业播报,沉稳有力",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/voice/Radio_Host.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "Chinese (Mandarin)_Straightforward_Boy",
|
||||
"name": "率真弟弟",
|
||||
"language": "zh",
|
||||
"description": "率真爽朗,青春阳光",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/voice/Straightforward_Boy.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "Chinese (Mandarin)_Gentleman",
|
||||
"name": "温润男声",
|
||||
"language": "zh",
|
||||
"description": "温润如玉,低沉磁性",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/voice/Gentleman.mp3",
|
||||
},
|
||||
]
|
||||
|
||||
DEFAULT_VOICE_ID = "tianxin_xiaoling"
|
||||
|
||||
|
||||
class ViduTTSService:
|
||||
"""Vidu TTS 服务封装"""
|
||||
|
||||
def __init__(self):
|
||||
self.provider = ViduProvider()
|
||||
|
||||
# ==================== 预设音色 ====================
|
||||
|
||||
@staticmethod
|
||||
def get_preset_voices() -> list[dict]:
|
||||
"""获取预设音色列表"""
|
||||
return VIDU_PRESET_VOICES
|
||||
|
||||
@staticmethod
|
||||
def get_voice_by_id(voice_id: str) -> dict | None:
|
||||
"""根据 ID 获取音色信息"""
|
||||
for voice in VIDU_PRESET_VOICES:
|
||||
if voice["voice_id"] == voice_id:
|
||||
return voice
|
||||
return None
|
||||
|
||||
# ==================== 同步 TTS ====================
|
||||
|
||||
async def synthesize_sync(
|
||||
self,
|
||||
text: str,
|
||||
voice_id: str | None = None,
|
||||
speed: float = 1.0,
|
||||
volume: int = 0,
|
||||
pitch: int = 0,
|
||||
**kwargs,
|
||||
) -> str:
|
||||
"""
|
||||
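Synchronous speech synthesis; returns the audio URL.
|
||||
|
||||
Args:
|
||||
text: Text to synthesize (fewer than 10000 characters)
|
||||
voice_id: Voice ID (defaults to tianxin_xiaoling, "甜心小玲")
|
||||
speed: Speech rate (0.5 to 2.0)
|
||||
volume: Volume (0 to 10; 0 = normal)
|
||||
pitch: Pitch (-12 to 12)
|
||||
|
||||
Returns:
|
||||
Audio URL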
|
||||
"""
|
||||
if not text or not text.strip():
|
||||
raise ValueError("text 不能为空")
|
||||
|
||||
voice = voice_id or DEFAULT_VOICE_ID
|
||||
|
||||
result = await self.provider.tts_sync(
|
||||
text=text,
|
||||
voice_id=voice,
|
||||
speed=speed,
|
||||
volume=volume,
|
||||
pitch=pitch,
|
||||
**kwargs,
|
||||
)
|
||||
|
||||
audio_url = result.get("file_url")
|
||||
if not audio_url:
|
||||
raise ValueError("TTS 合成失败: 未返回音频 URL")
|
||||
|
||||
logger.info(f"[Vidu TTS] 合成成功: voice_id={voice}, url={audio_url[:60]}...")
|
||||
return audio_url
|
||||
|
||||
# ==================== 声音复刻 ====================
|
||||
|
||||
async def clone_voice(
|
||||
self,
|
||||
audio_url: str,
|
||||
voice_id: str,
|
||||
text: str | None = None,
|
||||
prompt_audio_url: str | None = None,
|
||||
prompt_text: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
"""
|
||||
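Voice cloning (synchronous endpoint).
|
||||
|
||||
Args:
|
||||
audio_url: URL of the source audio
|
||||
voice_id: Custom voice_id (8 to 256 characters, starting with a letter)
|
||||
text: Preview text (at most 1000 characters); when omitted, a default sample sentence is used
|
||||
prompt_audio_url: URL of a sample audio clip (shorter than 8 seconds)
|
||||
prompt_text: Transcript of the sample audio
|
||||
|
||||
Returns:
|
||||
Clone result dict, including voice_id, demo_audio, and related fields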
|
||||
"""
|
||||
trial_text = text or "你好,欢迎使用vidu开放平台"
|
||||
|
||||
result = await self.provider.clone_voice(
|
||||
audio_url=audio_url,
|
||||
voice_id=voice_id,
|
||||
text=trial_text,
|
||||
prompt_audio_url=prompt_audio_url,
|
||||
prompt_text=prompt_text,
|
||||
)
|
||||
|
||||
logger.info(f"[Vidu Clone] 复刻成功: voice_id={result.get('voice_id')}")
|
||||
return result
|
||||
|
||||
async def query_clone_task(self, voice_id: str) -> dict[str, Any]:
|
||||
"""
|
||||
Vidu 声音复刻是同步接口,无独立查询。
|
||||
此方法仅做兼容,返回已知的 voice_id 信息。
|
||||
"""
|
||||
return {"voice_id": voice_id, "status": "succeeded"}
|
||||
|
||||
# ==================== 对口型 ====================
|
||||
|
||||
async def lip_sync_create(
|
||||
self,
|
||||
video_url: str,
|
||||
audio_url: str | None = None,
|
||||
text: str | None = None,
|
||||
voice_id: str | None = None,
|
||||
speed: float = 1.0,
|
||||
volume: int = 0,
|
||||
ref_photo_url: str | None = None,
|
||||
callback_url: str | None = None,
|
||||
) -> str:
|
||||
"""
|
||||
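Create a lip-sync task (asynchronous endpoint); returns the task_id.
|
||||
|
||||
Args:
|
||||
video_url: URL of the source video
|
||||
audio_url: Audio URL (provide either this or text, not both)
|
||||
text: Text content (provide either this or audio_url, not both)
|
||||
voice_id: Voice ID (only used when the task is driven by text)
|
||||
speed: Speech rate (only used when the task is driven by text)
|
||||
volume: Volume (only used when the task is driven by text)
|
||||
ref_photo_url: URL of a face reference image
|
||||
callback_url: Callback URL
|
||||
|
||||
Returns:
|
||||
task_id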
|
||||
"""
|
||||
result = await self.provider.lip_sync(
|
||||
video_url=video_url,
|
||||
audio_url=audio_url,
|
||||
text=text,
|
||||
voice_id=voice_id,
|
||||
speed=speed,
|
||||
volume=volume,
|
||||
ref_photo_url=ref_photo_url,
|
||||
callback_url=callback_url,
|
||||
)
|
||||
|
||||
task_id = result.get("task_id")
|
||||
if not task_id:
|
||||
raise ValueError("对口型任务创建失败: 未返回 task_id")
|
||||
|
||||
logger.info(f"[Vidu LipSync] 任务创建成功: task_id={task_id}")
|
||||
return task_id
|
||||
|
||||
async def lip_sync_query(self, task_id: str) -> dict[str, Any]:
|
||||
"""
|
||||
查询对口型任务状态及生成物。
|
||||
|
||||
Returns:
|
||||
任务状态 dict,包含 state、creations 等
|
||||
"""
|
||||
result = await self.provider.query_task(task_id)
|
||||
logger.info(f"[Vidu LipSync] 查询状态: task_id={task_id}, state={result.get('state')}")
|
||||
return result
|
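The service methods above document parameter ranges in their docstrings but pass values straight to the provider. The sketch below shows one way those documented limits could be checked before a request is sent. Both function names are hypothetical; the ranges are taken from the docstrings and the Vidu API doc in this commit, and raising `ValueError` is chosen only because the route layer already maps `ValueError` to a 422 response.

```python
from typing import Optional

MAX_TEXT_CHARS = 10000  # the API doc requires the text to be shorter than this


def validate_tts_params(
    text: str, speed: float = 1.0, volume: int = 0, pitch: int = 0
) -> None:
    """Raise ValueError if a TTS argument falls outside the documented range."""
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    if len(text) >= MAX_TEXT_CHARS:
        raise ValueError(f"text must be shorter than {MAX_TEXT_CHARS} characters")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be within [0.5, 2.0]")
    if not 0 <= volume <= 10:
        raise ValueError("volume must be within [0, 10]")
    if not -12 <= pitch <= 12:
        raise ValueError("pitch must be within [-12, 12]")


def validate_lip_sync_source(audio_url: Optional[str], text: Optional[str]) -> str:
    """Return "audio" or "text" for the supplied driver; exactly one must be given."""
    if bool(audio_url) == bool(text):
        raise ValueError("provide exactly one of audio_url or text")
    return "audio" if audio_url else "text"
```

Checking locally avoids spending a request (and possibly credits) on input the API would reject anyway.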
||||
@@ -12,6 +12,10 @@ services:
|
||||
- REDIS_PORT=6379
|
||||
- REDIS_DB=1
|
||||
- SECRET_KEY=dev-secret-key-change-in-production
|
||||
- MINIMAX_API_KEY=${MINIMAX_API_KEY}
|
||||
- MINIMAX_BASE_URL=${MINIMAX_BASE_URL:-https://api.minimaxi.com}
|
||||
- VIDU_API_KEY=${VIDU_API_KEY}
|
||||
- VIDU_BASE_URL=${VIDU_BASE_URL:-https://api.vidu.cn}
|
||||
volumes:
|
||||
- .:/app
|
||||
- ~/Documents/Meijiaka-zj:/root/Documents/Meijiaka-zj
|
||||
|
||||
@@ -1,245 +1,306 @@
|
||||
/**
|
||||
* VoiceDubbing 样式
|
||||
* ==================
|
||||
*/
|
||||
/* 语音配音页面 — 遵循项目样式规范 */
|
||||
|
||||
.voice-dubbing {
|
||||
width: 100%;
|
||||
height: 100%;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
}
|
||||
|
||||
/* 左右分栏 */
|
||||
.dubbing-layout {
|
||||
display: grid;
|
||||
grid-template-columns: 1fr 1fr;
|
||||
gap: var(--spacing-lg);
|
||||
margin-top: var(--spacing-md);
|
||||
gap: var(--spacing-xl);
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
}
|
||||
|
||||
.voice-panel,
|
||||
.mapping-panel {
|
||||
/* ========== 左侧 ========== */
|
||||
|
||||
.voice-sidebar {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: var(--spacing-md);
|
||||
gap: var(--spacing-lg);
|
||||
min-height: 0;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.panel-section {
|
||||
background: var(--bg-card);
|
||||
border: 1px solid var(--border-light);
|
||||
border-radius: var(--radius-lg);
|
||||
padding: var(--spacing-md);
|
||||
.voice-sidebar > .voice-section:first-child {
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.panel-section h4 {
|
||||
font-size: 13px;
|
||||
.voice-list {
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
overflow-y: auto;
|
||||
}
|
||||
|
||||
.voice-section {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: var(--spacing-sm);
|
||||
}
|
||||
|
||||
.voice-section-header {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
}
|
||||
|
||||
.voice-section-title {
|
||||
font-size: var(--font-sm);
|
||||
font-weight: 600;
|
||||
color: var(--text-primary);
|
||||
margin-bottom: var(--spacing-sm);
|
||||
}
|
||||
|
||||
/* 音色网格 */
|
||||
.voice-grid {
|
||||
display: grid;
|
||||
grid-template-columns: 1fr 1fr;
|
||||
.link-btn {
|
||||
font-size: var(--font-sm);
|
||||
color: var(--primary);
|
||||
background: none;
|
||||
border: none;
|
||||
cursor: pointer;
|
||||
padding: 0;
|
||||
}
|
||||
|
||||
.link-btn:hover {
|
||||
text-decoration: underline;
|
||||
}
|
||||
|
||||
/* Tab — 遵循项目选项卡风格 */
|
||||
.voice-tabs {
|
||||
display: flex;
|
||||
gap: 0;
|
||||
border-bottom: 1px solid var(--border-light);
|
||||
}
|
||||
|
||||
.voice-tab {
|
||||
padding: 6px 12px;
|
||||
border: none;
|
||||
border-bottom: 2px solid transparent;
|
||||
background: none;
|
||||
color: var(--text-secondary);
|
||||
font-size: var(--font-sm);
|
||||
cursor: pointer;
|
||||
transition: all var(--transition-fast);
|
||||
}
|
||||
|
||||
.voice-tab:hover {
|
||||
color: var(--primary);
|
||||
}
|
||||
|
||||
.voice-tab.active {
|
||||
border-bottom-color: var(--primary);
|
||||
color: var(--primary);
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
/* 试听条 */
|
||||
.voice-preview-bar {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: var(--spacing-sm);
|
||||
padding: var(--spacing-sm);
|
||||
background: var(--primary-light);
|
||||
border-radius: var(--radius-md);
|
||||
}
|
||||
|
||||
.voice-preview-audio {
|
||||
flex: 1;
|
||||
height: 28px;
|
||||
}
|
||||
|
||||
/* 音色列表 — 遵循 .option-card 规范 */
|
||||
.voice-list {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: var(--spacing-xs);
|
||||
}
|
||||
|
||||
.voice-card {
|
||||
border: 1px solid var(--border-light);
|
||||
.voice-row {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
padding: var(--spacing-sm) var(--spacing-md);
|
||||
border-radius: var(--radius-md);
|
||||
padding: var(--spacing-sm);
|
||||
border: 1px solid var(--border-color);
|
||||
background: var(--bg-card);
|
||||
cursor: pointer;
|
||||
transition: all 0.15s ease;
|
||||
background: var(--bg-primary);
|
||||
transition: all var(--transition-fast);
|
||||
}
|
||||
|
||||
.voice-card:hover {
|
||||
border-color: var(--primary-light);
|
||||
.voice-row:hover {
|
||||
border-color: var(--primary);
|
||||
background: var(--bg-hover);
|
||||
}
|
||||
|
||||
.voice-card.selected {
|
||||
.voice-row.selected {
|
||||
border-color: var(--primary);
|
||||
background: color-mix(in srgb, var(--primary) 5%, var(--bg-card));
|
||||
background: var(--primary-light);
|
||||
}
|
||||
|
||||
.voice-name {
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
.voice-row-main {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: space-between;
|
||||
width: 100%;
|
||||
}
|
||||
|
||||
.voice-row-info {
|
||||
flex: 1;
|
||||
min-width: 0;
|
||||
}
|
||||
|
||||
.voice-row-name {
|
||||
font-size: var(--font-sm);
|
||||
font-weight: 500;
|
||||
color: var(--text-primary);
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 6px;
|
||||
}
|
||||
|
||||
.recommended-tag {
|
||||
font-size: 10px;
|
||||
background: color-mix(in srgb, var(--primary) 15%, transparent);
|
||||
color: var(--primary);
|
||||
padding: 1px 5px;
|
||||
border-radius: var(--radius-sm);
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.voice-desc {
|
||||
font-size: 11px;
|
||||
.voice-row-desc {
|
||||
font-size: var(--font-xs);
|
||||
color: var(--text-secondary);
|
||||
margin-top: 2px;
|
||||
}
|
||||
|
||||
/* 试听 */
|
||||
.preview-row {
|
||||
display: flex;
|
||||
gap: var(--spacing-sm);
|
||||
align-items: flex-end;
|
||||
}
|
||||
|
||||
.preview-text {
|
||||
flex: 1;
|
||||
padding: var(--spacing-sm);
|
||||
border: 1px solid var(--border-light);
|
||||
border-radius: var(--radius-md);
|
||||
font-size: 13px;
|
||||
resize: none;
|
||||
line-height: 1.5;
|
||||
font-family: inherit;
|
||||
}
|
||||
|
||||
.preview-audio {
|
||||
width: 100%;
|
||||
height: 36px;
|
||||
margin-top: var(--spacing-sm);
|
||||
}
|
||||
|
||||
/* 批量合成 */
|
||||
.batch-info {
|
||||
display: flex;
|
||||
gap: var(--spacing-md);
|
||||
font-size: 12px;
|
||||
.voice-row-desc-inline {
|
||||
font-size: var(--font-xs);
|
||||
color: var(--text-secondary);
|
||||
margin-bottom: var(--spacing-sm);
|
||||
flex-wrap: wrap;
|
||||
margin-left: 8px;
|
||||
font-weight: 400;
|
||||
}
|
||||
|
||||
.batch-btn {
|
||||
width: 100%;
|
||||
/* 标签 — 遵循全局 .tag 风格,不覆盖 */
|
||||
.voice-row-name .tag {
|
||||
font-size: var(--font-xs);
|
||||
padding: 1px 5px;
|
||||
}
|
||||
|
||||
.progress-bar {
|
||||
height: 4px;
|
||||
background: var(--bg-light);
|
||||
border-radius: 2px;
|
||||
overflow: hidden;
|
||||
margin-top: var(--spacing-sm);
|
||||
/* 试听按钮 — 图标按钮风格 */
|
||||
.preview-icon {
|
||||
width: 32px;
|
||||
height: 32px;
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
border: none;
|
||||
border-radius: var(--radius-md);
|
||||
background: var(--bg-input);
|
||||
color: var(--text-secondary);
|
||||
font-size: var(--font-xs);
|
||||
cursor: pointer;
|
||||
flex-shrink: 0;
|
||||
transition: all var(--transition-fast);
|
||||
}
|
||||
|
||||
.progress-fill {
|
||||
height: 100%;
|
||||
.preview-icon:hover {
|
||||
background: var(--primary);
|
||||
transition: width 0.3s ease;
|
||||
color: var(--text-inverse);
|
||||
}
|
||||
|
||||
/* 分镜配音列表 */
|
||||
.segment-voice-list {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: var(--spacing-xs);
|
||||
max-height: 400px;
|
||||
overflow-y: auto;
|
||||
}
|
||||
|
||||
.seg-voice-item {
|
||||
border: 1px solid var(--border-light);
|
||||
border-radius: var(--radius-md);
|
||||
padding: var(--spacing-sm);
|
||||
background: var(--bg-primary);
|
||||
}
|
||||
|
||||
.seg-voice-item.empty-shot {
|
||||
.preview-icon:disabled {
|
||||
opacity: 0.5;
|
||||
cursor: not-allowed;
|
||||
}
|
||||
|
||||
.seg-voice-info {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 4px;
|
||||
/* 空状态 */
|
||||
.voice-empty {
|
||||
padding: var(--spacing-xl);
|
||||
text-align: center;
|
||||
color: var(--text-secondary);
|
||||
font-size: var(--font-sm);
|
||||
}
|
||||
|
||||
.seg-voice-index {
|
||||
font-size: 12px;
|
||||
.voice-empty small {
|
||||
font-size: var(--font-xs);
|
||||
opacity: 0.7;
|
||||
}
|
||||
|
||||
/* 语速 */
|
||||
.speed-value {
|
||||
font-size: var(--font-sm);
|
||||
color: var(--primary);
|
||||
font-weight: 600;
|
||||
color: var(--text-primary);
|
||||
}
|
||||
|
||||
.seg-has-audio {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 4px;
|
||||
}
|
||||
|
||||
.audio-name {
|
||||
font-size: 11px;
|
||||
color: var(--success);
|
||||
}
|
||||
|
||||
.seg-audio-player {
|
||||
height: 28px;
|
||||
width: 100%;
|
||||
}
|
||||
|
||||
.seg-no-audio {
|
||||
font-size: 11px;
|
||||
.speed-value small {
|
||||
font-weight: 400;
|
||||
color: var(--text-secondary);
|
||||
margin-left: 4px;
|
||||
}
|
||||
|
||||
.seg-voiceover {
|
||||
font-size: 11px;
|
||||
color: var(--text-secondary);
|
||||
margin-top: 4px;
|
||||
line-height: 1.4;
|
||||
overflow: hidden;
|
||||
text-overflow: ellipsis;
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
/* 音频文件库 */
|
||||
.audio-file-list {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: var(--spacing-xs);
|
||||
}
|
||||
|
||||
.audio-file-item {
|
||||
.speed-slider-wrap {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: var(--spacing-sm);
|
||||
padding: var(--spacing-xs) 0;
|
||||
border-bottom: 1px solid var(--border-light);
|
||||
gap: var(--spacing-md);
|
||||
width: 100%;
|
||||
}
|
||||
|
||||
.audio-file-item:last-child {
|
||||
border-bottom: none;
|
||||
}
|
||||
|
||||
.audio-file-info {
|
||||
flex: 1;
|
||||
min-width: 0;
|
||||
}
|
||||
|
||||
.audio-file-name {
|
||||
font-size: 12px;
|
||||
font-weight: 500;
|
||||
color: var(--text-primary);
|
||||
display: block;
|
||||
overflow: hidden;
|
||||
text-overflow: ellipsis;
|
||||
.speed-slider-wrap span {
|
||||
font-size: var(--font-xs);
|
||||
color: var(--text-tertiary);
|
||||
white-space: nowrap;
|
||||
flex-shrink: 0;
|
||||
min-width: 36px;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
.audio-file-size {
|
||||
font-size: 11px;
|
||||
.speed-slider-wrap .slider-input {
|
||||
flex: 1;
|
||||
}
|
||||
|
||||
/* 底部生成按钮 — 复用全局 .btn-primary,只做宽度调整 */
|
||||
.voice-generate-wrap {
|
||||
margin-top: auto;
|
||||
padding-top: var(--spacing-md);
|
||||
}
|
||||
|
||||
.voice-generate-wrap .btn {
|
||||
width: 100%;
|
||||
}
|
||||
|
||||
/* ========== 右侧 ========== */
|
||||
|
||||
.script-content {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
min-height: 0;
|
||||
}
|
||||
|
||||
.script-content-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: space-between;
|
||||
font-size: var(--font-sm);
|
||||
font-weight: 600;
|
||||
color: var(--text-primary);
|
||||
margin-bottom: var(--spacing-sm);
|
||||
}
|
||||
|
||||
.script-content-meta {
|
||||
font-size: var(--font-xs);
|
||||
font-weight: 400;
|
||||
color: var(--text-secondary);
|
||||
}
|
||||
|
||||
.audio-file-player {
|
||||
height: 28px;
|
||||
flex-shrink: 0;
|
||||
/* textarea 撑满剩余空间 */
|
||||
.script-content textarea {
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
line-height: 1.8;
|
||||
}
|
||||
|
||||
/* 内嵌试听播放器 */
|
||||
.voice-preview-inline {
|
||||
margin-top: var(--spacing-sm);
|
||||
padding-top: var(--spacing-sm);
|
||||
border-top: 1px solid var(--border-light);
|
||||
}
|
||||
|
||||
.voice-preview-inline .voice-preview-audio {
|
||||
width: 100%;
|
||||
}
|
||||
|
||||
@@ -1,314 +1,288 @@
|
||||
/**
|
||||
* 配音管理页面
|
||||
* =============
|
||||
* 语音配音页面 (Step 3)
|
||||
* ======================
|
||||
*
|
||||
* TTS 文本转语音:选择音色、批量合成旁白配音。
|
||||
* 管理项目音频文件,关联到分镜。
|
||||
* 布局:左侧窄栏(音色 + 语速 + 生成按钮固定底部)| 右侧宽栏(配音文案)
|
||||
*/
|
||||
|
||||
import { useState, useEffect, useCallback, useRef } from 'react';
|
||||
import { useState, useEffect, useMemo, useCallback } from 'react';
|
||||
import { useProjectStore } from '../../store';
|
||||
import { useVoiceStore } from '../../store/voiceStore';
|
||||
import { getCurrentProjectId } from '../../api/modules/localStorage';
|
||||
import { synthesizeTTS, synthesizeBatchTTS } from '../../api/modules/voice';
|
||||
import { saveAudio } from '../../api/modules/voice';
|
||||
import { synthesizeTTS, saveAudio, uploadAudio } from '../../api/modules/voice';
|
||||
import { toast } from '../../store/uiStore';
|
||||
import { useProgressStore } from '../../store/progressStore';
|
||||
import './VoiceDubbing.css';
|
||||
|
||||
export default function VoiceDubbing() {
|
||||
const projectId = getCurrentProjectId();
|
||||
const segments = useProjectStore(state => state.segments);
|
||||
const updateSegment = useProjectStore(state => state.updateSegment);
|
||||
const projectId = getCurrentProjectId();
|
||||
|
||||
const {
|
||||
presetVoices,
|
||||
voiceMaterials,
|
||||
selectedVoiceId,
|
||||
speed,
|
||||
volume,
|
||||
pitch,
|
||||
loadPresetVoices,
|
||||
loadVoiceMaterials,
|
||||
setSelectedVoiceId,
|
||||
projectAudios,
|
||||
setSpeed,
|
||||
setVolume,
|
||||
setPitch,
|
||||
loadProjectAudios,
|
||||
getAudioForSegment,
|
||||
setAudioMapping,
|
||||
} = useVoiceStore();
|
||||
|
||||
const [isSynthesizing, setIsSynthesizing] = useState(false);
|
||||
const [synthProgress, setSynthProgress] = useState(0);
|
||||
const [synthTotal, setSynthTotal] = useState(0);
|
||||
const [customText, setCustomText] = useState('');
|
||||
const [customPreviewUrl, setCustomPreviewUrl] = useState<string | null>(null);
|
||||
const audioPreviewRef = useRef<HTMLAudioElement>(null);
|
||||
const [isGenerating, setIsGenerating] = useState(false);
|
||||
const [activeVoiceTab, setActiveVoiceTab] = useState<'preset' | 'clone'>('preset');
|
||||
const [activePreviewVoiceId, setActivePreviewVoiceId] = useState<string | null>(null);
|
||||
|
||||
// 加载音色和项目音频
|
||||
useEffect(() => {
|
||||
loadPresetVoices();
|
||||
if (projectId) {
|
||||
loadProjectAudios(projectId);
|
||||
}
|
||||
loadVoiceMaterials();
|
||||
if (projectId) loadProjectAudios(projectId);
|
||||
}, [projectId]);
|
||||
|
||||
// 获取有旁白文本的分镜(排除空镜)
|
||||
const voicedSegments = segments.filter(s => s.type !== 'empty_shot' && s.voiceover);
|
||||
const totalChars = voicedSegments.reduce((sum, s) => sum + (s.voiceover?.length || 0), 0);
|
||||
const mergedText = useMemo(
|
||||
() => segments.map(s => s.voiceover?.trim() || '【空镜】').join('\n'),
|
||||
[segments]
|
||||
);
|
||||
const totalChars = mergedText.length;
|
||||
|
||||
// 批量合成所有旁白
|
||||
const handleBatchSynthesize = useCallback(async () => {
|
||||
if (!projectId || voicedSegments.length === 0) {
|
||||
toast.warn('没有需要合成的旁白');
|
||||
const handleTogglePreview = useCallback((voiceId: string, voiceName: string, e: React.MouseEvent) => {
|
||||
e.stopPropagation();
|
||||
// 点击同一个就是关闭
|
||||
if (activePreviewVoiceId === voiceId) {
|
||||
setActivePreviewVoiceId(null);
|
||||
return;
|
||||
}
|
||||
setActivePreviewVoiceId(voiceId);
|
||||
}, [activePreviewVoiceId]);
|
||||
|
||||
setIsSynthesizing(true);
|
||||
setSynthProgress(0);
|
||||
setSynthTotal(voicedSegments.length);
|
||||
|
||||
let successCount = 0;
|
||||
let failCount = 0;
|
||||
|
||||
try {
|
||||
for (let i = 0; i < voicedSegments.length; i++) {
|
||||
const seg = voicedSegments[i];
|
||||
const segId = seg.id?.toString() || String(i);
|
||||
const text = seg.voiceover || '';
|
||||
|
||||
setSynthProgress(i + 1);
|
||||
|
||||
try {
|
||||
// 同步 TTS 合成(≤200字)
|
||||
const result = await synthesizeTTS({
|
||||
text,
|
||||
voiceId: selectedVoiceId,
|
||||
speed: 1.0,
|
||||
});
|
||||
|
||||
if (!result.audioBase64) {
|
||||
throw new Error('未返回音频数据');
|
||||
}
|
||||
|
||||
// 保存到本地
|
||||
const audioId = `tts_${segId}_${Date.now()}`;
|
||||
const meta = await saveAudio({
|
||||
projectId,
|
||||
audioId,
|
||||
audioData: result.audioBase64,
|
||||
name: `旁白-${segId}`,
|
||||
voiceId: selectedVoiceId,
|
||||
duration: 0, // 暂时无法获取时长
|
||||
segmentId: segId,
|
||||
});
|
||||
|
||||
// 关联到分镜
|
||||
setAudioMapping(segId, meta.id);
|
||||
|
||||
// 更新分镜 audioPath
|
||||
updateSegment(seg.id!, { audioPath: meta.filePath });
|
||||
successCount++;
|
||||
} catch (err) {
|
||||
console.error(`[VoiceDubbing] 分镜 ${segId} 合成失败:`, err);
|
||||
failCount++;
|
||||
}
|
||||
}
|
||||
|
||||
if (successCount > 0) {
|
||||
toast.success(`配音合成完成:成功 ${successCount} 段${failCount > 0 ? `,失败 ${failCount} 段` : ''}`);
|
||||
} else {
|
||||
toast.error('配音合成全部失败');
|
||||
}
|
||||
} finally {
|
||||
setIsSynthesizing(false);
|
||||
setSynthProgress(0);
|
||||
}
|
||||
}, [projectId, voicedSegments, selectedVoiceId, updateSegment, setAudioMapping]);
|
||||
|
||||
// 试听音色
|
||||
const handlePreviewVoice = useCallback(async () => {
|
||||
if (!customText.trim()) {
|
||||
toast.warn('请输入要预览的文本');
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
setCustomPreviewUrl(null);
|
||||
const result = await synthesizeTTS({
|
||||
text: customText.slice(0, 200),
|
||||
voiceId: selectedVoiceId,
|
||||
speed: 1.0,
|
||||
});
|
||||
|
||||
if (!result.audioBase64) {
|
||||
throw new Error('未返回音频数据');
|
||||
}
|
||||
|
||||
const audioBlob = new Blob(
|
||||
[Uint8Array.from(atob(result.audioBase64), c => c.charCodeAt(0))],
|
||||
{ type: 'audio/mp3' }
|
||||
);
|
||||
const url = URL.createObjectURL(audioBlob);
|
||||
setCustomPreviewUrl(url);
|
||||
} catch (err) {
|
||||
toast.error(`试听失败: ${err instanceof Error ? err.message : String(err)}`);
|
||||
}
|
||||
}, [customText, selectedVoiceId]);
|
||||
|
||||
// 将项目音频关联到分镜
|
||||
const handleAssignToSegment = (audioId: string, segmentId: string) => {
|
||||
setAudioMapping(segmentId, audioId);
|
||||
|
||||
// 同时更新分镜的 audioPath
|
||||
const audio = projectAudios.find(a => a.id === audioId);
|
||||
if (audio) {
|
||||
updateSegment(parseInt(segmentId), { audioPath: audio.filePath });
|
||||
}
|
||||
toast.success('已关联到分镜');
|
||||
const getPreviewUrl = (voiceId: string): string | null => {
|
||||
const voice = presetVoices.find(v => v.voiceId === voiceId);
|
||||
return voice?.previewUrl || null;
|
||||
};
|
||||
|
||||
const selectedVoice = presetVoices.find(v => v.voiceId === selectedVoiceId);
|
||||
const handleGenerate = useCallback(async () => {
|
||||
if (!projectId) { toast.warning('请先创建项目'); return; }
|
||||
const realText = segments.map(s => s.voiceover?.trim()).filter(Boolean).join('\n');
|
||||
if (!realText) { toast.warning('没有需要合成的旁白文本'); return; }
|
||||
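// Truncate to 1000 characters per request (a conservative cap inherited from the Kling TTS limit; Vidu TTS accepts text shorter than 10000 characters)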
|
||||
const truncatedText = realText.length > 1000 ? realText.slice(0, 1000) : realText;
|
||||
|
||||
const progress = useProgressStore.getState();
|
||||
setIsGenerating(true);
|
||||
progress.show('生成配音');
|
||||
|
||||
try {
|
||||
progress.update('正在合成语音...');
|
||||
const result = await synthesizeTTS({ text: truncatedText, voiceId: selectedVoiceId, speed, volume, pitch });
|
||||
if (!result.audioUrl) throw new Error('未返回音频 URL');
|
||||
|
||||
progress.update('正在保存音频...');
|
||||
// 下载音频 blob
|
||||
const response = await fetch(result.audioUrl);
|
||||
if (!response.ok) throw new Error('下载音频失败');
|
||||
const blob = await response.blob();
|
||||
|
||||
// 上传七牛云
|
||||
const file = new File([blob], `tts_${Date.now()}.mp3`, { type: 'audio/mp3' });
|
||||
const qiniuUrl = await uploadAudio(file);
|
||||
|
||||
// 本地保存
|
||||
const base64 = await new Promise<string>((resolve, reject) => {
|
||||
const reader = new FileReader();
|
||||
reader.onloadend = () => {
|
||||
const dataUrl = reader.result as string;
|
||||
resolve(dataUrl.split(',')[1]);
|
||||
};
|
||||
reader.onerror = reject;
|
||||
reader.readAsDataURL(blob);
|
||||
});
|
||||
|
||||
const audioId = `voice_${Date.now()}`;
|
||||
const meta = await saveAudio({
|
||||
projectId, audioId, audioData: base64,
|
||||
name: `配音-${segments.length}段`, voiceId: selectedVoiceId, duration: 0,
|
||||
});
|
||||
|
||||
for (const seg of segments) {
|
||||
const segId = seg.id;
|
||||
if (segId) {
|
||||
setAudioMapping(segId.toString(), meta.id);
|
||||
updateSegment(segId, { audioPath: meta.filePath, audioUrl: qiniuUrl });
|
||||
}
|
||||
}
|
||||
progress.success('配音生成完成');
|
||||
} catch (err) {
|
||||
progress.error(err instanceof Error ? err.message : '生成失败');
|
||||
} finally {
|
||||
setIsGenerating(false);
|
||||
}
|
||||
}, [projectId, segments, selectedVoiceId, speed, volume, pitch, setAudioMapping, updateSegment]);
|
||||
|
||||
return (
|
||||
<div className="voice-dubbing">
|
||||
<div className="step-header">
|
||||
<h2>配音管理</h2>
|
||||
<p className="step-desc">
|
||||
{voicedSegments.length} 个分镜待配音,共 {totalChars} 字
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<div className="dubbing-layout">
|
||||
{/* 左侧:音色选择 + 批量合成 */}
|
||||
<div className="voice-panel">
|
||||
{/* 左侧:音色 + 语速 + 生成按钮 */}
|
||||
<div className="voice-sidebar">
|
||||
{/* 音色选择 */}
|
||||
<div className="panel-section">
|
||||
<h4>选择音色</h4>
|
||||
<div className="voice-grid">
|
||||
{presetVoices.map(voice => (
|
||||
<div
|
||||
key={voice.voiceId}
|
||||
className={`voice-card ${voice.voiceId === selectedVoiceId ? 'selected' : ''}`}
|
||||
onClick={() => setSelectedVoiceId(voice.voiceId)}
|
||||
>
|
||||
<div className="voice-name">
|
||||
{voice.name}
|
||||
{voice.recommended && <span className="recommended-tag">推荐</span>}
|
||||
</div>
|
||||
<div className="voice-desc">{voice.description}</div>
|
||||
</div>
|
||||
))}
|
||||
<div className="voice-section">
|
||||
<div className="voice-section-header">
|
||||
<span className="voice-section-title">选择音色</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* 音色试听 */}
|
||||
<div className="panel-section">
|
||||
<h4>试听音色</h4>
|
||||
<div className="preview-row">
|
||||
<textarea
|
||||
className="preview-text"
|
||||
value={customText}
|
||||
onChange={e => setCustomText(e.target.value)}
|
||||
placeholder="输入文本试听音色(≤200字)..."
|
||||
rows={3}
|
||||
maxLength={200}
|
||||
/>
|
||||
<button
|
||||
className="btn btn-secondary"
|
||||
onClick={handlePreviewVoice}
|
||||
disabled={!customText.trim()}
|
||||
>
|
||||
试听
|
||||
<div className="voice-tabs">
|
||||
<button className={`voice-tab ${activeVoiceTab === 'preset' ? 'active' : ''}`} onClick={() => setActiveVoiceTab('preset')}>
|
||||
系统预设 ({presetVoices.length})
|
||||
</button>
|
||||
<button className={`voice-tab ${activeVoiceTab === 'clone' ? 'active' : ''}`} onClick={() => setActiveVoiceTab('clone')}>
|
||||
私有音色 ({voiceMaterials.filter(m => m.status === 'ready').length})
|
||||
</button>
|
||||
</div>
|
||||
{customPreviewUrl && (
|
||||
<audio ref={audioPreviewRef} src={customPreviewUrl} controls className="preview-audio" />
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* 批量合成 */}
|
||||
<div className="panel-section">
|
||||
<h4>批量配音</h4>
|
||||
<div className="batch-info">
|
||||
<span>音色:{selectedVoice?.name}</span>
|
||||
<span>分镜:{voicedSegments.length} 个</span>
|
||||
<span>字数:约 {totalChars} 字</span>
|
||||
</div>
|
||||
<button
|
||||
className="btn btn-primary batch-btn"
|
||||
onClick={handleBatchSynthesize}
|
||||
disabled={isSynthesizing || voicedSegments.length === 0}
|
||||
>
|
||||
{isSynthesizing
|
||||
? `合成中... ${synthProgress}/${synthTotal}`
|
||||
: `为 ${voicedSegments.length} 个分镜生成配音`}
|
||||
</button>
|
||||
{isSynthesizing && (
|
||||
<div className="progress-bar">
|
||||
<div
|
||||
className="progress-fill"
|
||||
style={{ width: `${(synthProgress / synthTotal) * 100}%` }}
|
||||
/>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* 右侧:分镜-配音映射 */}
|
||||
<div className="mapping-panel">
|
||||
<div className="panel-section">
|
||||
<h4>分镜配音状态</h4>
|
||||
<div className="segment-voice-list">
|
||||
{segments.map((seg, i) => {
|
||||
const segId = seg.id?.toString() || String(i);
|
||||
const audio = getAudioForSegment(segId);
|
||||
const isEmptyShot = seg.type === 'empty_shot';
|
||||
|
||||
return (
|
||||
<div key={segId} className={`seg-voice-item ${isEmptyShot ? 'empty-shot' : ''}`}>
|
||||
<div className="seg-voice-info">
|
||||
<span className="seg-voice-index">
|
||||
{isEmptyShot ? '🎬' : '🎙️'} 镜头 {i + 1}
|
||||
</span>
|
||||
{audio ? (
|
||||
<div className="seg-has-audio">
|
||||
<span className="audio-name">{audio.name}</span>
|
||||
<audio
|
||||
src={`file://${audio.filePath}`}
|
||||
controls
|
||||
className="seg-audio-player"
|
||||
/>
|
||||
{activeVoiceTab === 'preset' && (
|
||||
<div className="voice-list">
|
||||
{presetVoices.map(v => (
|
||||
<div key={v.voiceId} className={`voice-row ${v.voiceId === selectedVoiceId ? 'selected' : ''}`} onClick={() => setSelectedVoiceId(v.voiceId)}>
|
||||
<div className="voice-row-main">
|
||||
<div className="voice-row-info">
|
||||
<div className="voice-row-name">
|
||||
{v.name}
|
||||
<span className="voice-row-desc-inline">{v.description}</span>
|
||||
</div>
|
||||
) : (
|
||||
<span className="seg-no-audio">
|
||||
{isEmptyShot ? '空镜无需配音' : '未配音'}
|
||||
</span>
|
||||
)}
|
||||
</div>
|
||||
<button className="preview-icon" onClick={e => handleTogglePreview(v.voiceId, v.name, e)}>
|
||||
{activePreviewVoiceId === v.voiceId ? '✕' : '▶'}
|
||||
</button>
|
||||
</div>
|
||||
<div className="seg-voiceover">{seg.voiceover || ''}</div>
|
||||
</div>
|
||||
);
|
||||
})}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Audio file list */}
|
||||
{projectAudios.length > 0 && (
|
||||
<div className="panel-section">
|
||||
<h4>音频文件库</h4>
|
||||
<div className="audio-file-list">
|
||||
{projectAudios.map(audio => (
|
||||
<div key={audio.id} className="audio-file-item">
|
||||
<div className="audio-file-info">
|
||||
<span className="audio-file-name">{audio.name}</span>
|
||||
<span className="audio-file-size">
|
||||
{(audio.fileSize / 1024).toFixed(1)} KB
|
||||
</span>
|
||||
</div>
|
||||
<audio
|
||||
src={`file://${audio.filePath}`}
|
||||
controls
|
||||
className="audio-file-player"
|
||||
/>
|
||||
{activePreviewVoiceId === v.voiceId && v.previewUrl && (
|
||||
<div className="voice-preview-inline">
|
||||
<audio src={v.previewUrl} controls className="voice-preview-audio" autoPlay />
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{activeVoiceTab === 'clone' && (
|
||||
<div className="voice-list">
|
||||
{voiceMaterials.filter(m => m.status === 'ready').length === 0 ? (
|
||||
<div className="voice-empty">暂无私有音色<br /><small>去素材库上传音频并克隆音色</small></div>
|
||||
) : (
|
||||
voiceMaterials.filter(m => m.status === 'ready').map(m => (
|
||||
<div key={m.voiceId} className={`voice-row ${m.voiceId === selectedVoiceId ? 'selected' : ''}`} onClick={() => setSelectedVoiceId(m.voiceId)}>
|
||||
<div className="voice-row-main">
|
||||
<div className="voice-row-info">
|
||||
<div className="voice-row-name">
|
||||
{m.name} <span className="tag clone">克隆</span>
|
||||
<span className="voice-row-desc-inline">
|
||||
{m.createdAt ? new Date(m.createdAt).toLocaleDateString('zh-CN') : ''}
|
||||
</span>
|
||||
</div>
|
||||
</div>
|
||||
<button className="preview-icon" onClick={e => handleTogglePreview(m.voiceId, m.name, e)}>
|
||||
{activePreviewVoiceId === m.voiceId ? '✕' : '▶'}
|
||||
</button>
|
||||
</div>
|
||||
{activePreviewVoiceId === m.voiceId && m.trialUrl && (
|
||||
<div className="voice-preview-inline">
|
||||
<audio src={m.trialUrl} controls className="voice-preview-audio" autoPlay />
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
))
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Speed */}
|
||||
<div className="voice-section">
|
||||
<div className="voice-section-header">
|
||||
<span className="voice-section-title">语速</span>
|
||||
<span className="speed-value">{speed.toFixed(1)}x</span>
|
||||
</div>
|
||||
)}
|
||||
<div className="speed-slider-wrap">
|
||||
<span>0.5x</span>
|
||||
<input
|
||||
type="range"
|
||||
className="slider-input"
|
||||
min={5}
|
||||
max={20}
|
||||
step={1}
|
||||
value={Math.round(speed * 10)}
|
||||
onChange={e => setSpeed(parseInt(e.target.value) / 10)}
|
||||
style={{ '--slider-percent': `${((Math.round(speed * 10) - 5) / 15) * 100}%` } as React.CSSProperties}
|
||||
/>
|
||||
<span>2.0x</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Volume */}
|
||||
<div className="voice-section">
|
||||
<div className="voice-section-header">
|
||||
<span className="voice-section-title">音量</span>
|
||||
<span className="speed-value">{volume}</span>
|
||||
</div>
|
||||
<div className="speed-slider-wrap">
|
||||
<span>0</span>
|
||||
<input
|
||||
type="range"
|
||||
className="slider-input"
|
||||
min={0}
|
||||
max={10}
|
||||
step={1}
|
||||
value={volume}
|
||||
onChange={e => setVolume(parseInt(e.target.value))}
|
||||
style={{ '--slider-percent': `${(volume / 10) * 100}%` } as React.CSSProperties}
|
||||
/>
|
||||
<span>10</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Pitch */}
|
||||
<div className="voice-section">
|
||||
<div className="voice-section-header">
|
||||
<span className="voice-section-title">音调</span>
|
||||
<span className="speed-value">{pitch}</span>
|
||||
</div>
|
||||
<div className="speed-slider-wrap">
|
||||
<span>-12</span>
|
||||
<input
|
||||
type="range"
|
||||
className="slider-input"
|
||||
min={-12}
|
||||
max={12}
|
||||
step={1}
|
||||
value={pitch}
|
||||
onChange={e => setPitch(parseInt(e.target.value))}
|
||||
style={{ '--slider-percent': `${((pitch + 12) / 24) * 100}%` } as React.CSSProperties}
|
||||
/>
|
||||
<span>12</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Bottom generate button */}
|
||||
<div className="voice-generate-wrap">
|
||||
<button className="btn btn-primary generate-btn" onClick={handleGenerate} disabled={isGenerating || !mergedText.trim()}>
|
||||
{isGenerating ? '合成中...' : '生成配音'}
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Right: voiceover script */}
|
||||
<div className="script-content">
|
||||
<div className="script-content-header">
|
||||
配音文案
|
||||
<span className="script-content-meta">{totalChars} 字 · {segments.length} 个分镜</span>
|
||||
</div>
|
||||
<textarea readOnly value={mergedText} rows={20} className="script-textarea" />
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
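The three range inputs above each compute `--slider-percent` inline, and the speed slider works in integer tenths (5 to 20) so the native control never has to step by a fraction. The arithmetic could be factored out roughly as follows. This is only a sketch: `sliderPercent`, `speedToSlider`, and `sliderToSpeed` are hypothetical names that do not appear in this commit.

```typescript
// Hypothetical helpers (not part of the commit) mirroring the inline slider math.

// Fill percentage of a range slider, clamped so the CSS variable stays in 0-100.
function sliderPercent(value: number, min: number, max: number): number {
  if (max <= min) {
    throw new RangeError('max must be greater than min');
  }
  const clamped = Math.min(max, Math.max(min, value));
  return ((clamped - min) / (max - min)) * 100;
}

// Speed is stored as a float (0.5-2.0) but the input works in integer tenths.
function speedToSlider(speed: number): number {
  return Math.round(speed * 10);
}

// Parse the input's string value back into a speed; the radix is explicit.
function sliderToSpeed(raw: string): number {
  return parseInt(raw, 10) / 10;
}
```

With helpers like these, the speed input's style would reduce to `sliderPercent(speedToSlider(speed), 5, 20)` and the pitch input's to `sliderPercent(pitch, -12, 12)`.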
@@ -7,7 +7,7 @@
|
||||
|
||||
import { create } from 'zustand';
|
||||
import { useShallow } from 'zustand/react/shallow';
|
||||
import type { VoiceInfo, AudioMeta } from '../api/modules/voice';
|
||||
import type { VoiceInfo, AudioMeta, VoiceMaterial, AvatarMaterial } from '../api/modules/voice';
|
||||
import * as voiceApi from '../api/modules/voice';
|
||||
|
||||
interface VoiceState {
|
||||
@@ -25,9 +25,26 @@ interface VoiceState {
|
||||
// Current project ID
|
||||
currentProjectId: string | null;
|
||||
|
||||
// Speed
|
||||
speed: number;
|
||||
|
||||
// Volume (0 to 10, default 0)
|
||||
volume: number;
|
||||
|
||||
// Pitch (-12 to 12)
|
||||
pitch: number;
|
||||
|
||||
// Loading state
|
||||
isLoadingVoices: boolean;
|
||||
isLoadingAudios: boolean;
|
||||
|
||||
// Material library (user-uploaded cloned voices)
|
||||
voiceMaterials: VoiceMaterial[];
|
||||
isLoadingMaterials: boolean;
|
||||
|
||||
// Video material library
|
||||
avatarMaterials: AvatarMaterial[];
|
||||
isLoadingAvatarMaterials: boolean;
|
||||
}
|
||||
|
||||
interface VoiceActions {
|
||||
@@ -35,6 +52,28 @@ interface VoiceActions {
|
||||
loadPresetVoices: () => Promise<void>;
|
||||
setSelectedVoiceId: (id: string) => void;
|
||||
|
||||
// Speed
|
||||
setSpeed: (speed: number) => void;
|
||||
|
||||
// Volume
|
||||
setVolume: (volume: number) => void;
|
||||
|
||||
// Pitch
|
||||
setPitch: (pitch: number) => void;
|
||||
|
||||
// Material library actions
|
||||
loadVoiceMaterials: () => Promise<void>;
|
||||
addVoiceMaterial: (file: File, name: string) => Promise<VoiceMaterial>;
|
||||
updateVoiceMaterialStatus: (id: string, status: VoiceMaterial['status'], voiceId?: string, trialUrl?: string) => void;
|
||||
renameVoiceMaterial: (id: string, name: string) => Promise<void>;
|
||||
deleteVoiceMaterial: (materialId: string) => Promise<void>;
|
||||
|
||||
// Video material library actions
|
||||
loadAvatarMaterials: () => Promise<void>;
|
||||
addAvatarMaterial: (file: File, name: string) => Promise<AvatarMaterial>;
|
||||
renameAvatarMaterial: (id: string, name: string) => Promise<void>;
|
||||
deleteAvatarMaterial: (materialId: string) => Promise<void>;
|
||||
|
||||
// Project audio actions
|
||||
loadProjectAudios: (projectId: string) => Promise<void>;
|
||||
saveAudio: (args: {
|
||||
@@ -58,12 +97,19 @@ interface VoiceActions {
|
||||
|
||||
const initialState: VoiceState = {
|
||||
presetVoices: [],
|
||||
selectedVoiceId: '829826751244537879', // 温柔女声 (Kling preset voice)
|
||||
selectedVoiceId: 'tianxin_xiaoling', // 甜心小玲
|
||||
projectAudios: [],
|
||||
audioMapping: {},
|
||||
currentProjectId: null,
|
||||
speed: 1.0,
|
||||
volume: 0,
|
||||
pitch: 0,
|
||||
isLoadingVoices: false,
|
||||
isLoadingAudios: false,
|
||||
voiceMaterials: [],
|
||||
isLoadingMaterials: false,
|
||||
avatarMaterials: [],
|
||||
isLoadingAvatarMaterials: false,
|
||||
};
|
||||
|
||||
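The defaults above (`speed: 1.0`, `volume: 0`, `pitch: 0`) match the ranges given in `vidu-tts-api.md`: speed in [0.5, 2], volume an integer in [0, 10], pitch an integer in [-12, 12]. The setters in this store write whatever they receive, so a guard along the following lines could normalize values before a request is built. It is a sketch under that assumption; `normalizeVoiceSettings`, `VOICE_RANGES`, and `clampOr` are hypothetical names, not part of this commit.

```typescript
// Hypothetical guard (not part of the commit) for the documented Vidu TTS ranges.
interface VoiceSettings {
  speed: number;  // float, [0.5, 2]
  volume: number; // integer, [0, 10]
  pitch: number;  // integer, [-12, 12]
}

const VOICE_RANGES = {
  speed: { min: 0.5, max: 2, fallback: 1 },
  volume: { min: 0, max: 10, fallback: 0 },
  pitch: { min: -12, max: 12, fallback: 0 },
} as const;

// Clamp into [min, max]; NaN (e.g. from parsing an empty string) uses the fallback.
function clampOr(value: number, min: number, max: number, fallback: number): number {
  if (Number.isNaN(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

function normalizeVoiceSettings(input: VoiceSettings): VoiceSettings {
  const { speed, volume, pitch } = VOICE_RANGES;
  return {
    speed: clampOr(input.speed, speed.min, speed.max, speed.fallback),
    // Volume and pitch are documented as Int, so round after clamping.
    volume: Math.round(clampOr(input.volume, volume.min, volume.max, volume.fallback)),
    pitch: Math.round(clampOr(input.pitch, pitch.min, pitch.max, pitch.fallback)),
  };
}
```

Whether clamping belongs in the store setters or at the API boundary is a design choice this commit leaves open; the sketch only shows the range logic.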
export const useVoiceStore = create<VoiceState & VoiceActions>()(
|
||||
@@ -79,14 +125,57 @@ export const useVoiceStore = create<VoiceState & VoiceActions>()(
|
||||
set({ presetVoices: voices });
|
||||
} catch (err) {
|
||||
console.error('[VoiceStore] 加载音色列表失败:', err);
|
||||
// Fail silently and fall back to defaults (Kling preset voices)
|
||||
// Fail silently and fall back to the preset voices
|
||||
set({
|
||||
presetVoices: [
|
||||
{ voiceId: '829826751244537879', name: '温柔女声', description: '温柔细腻', recommended: true, language: 'zh' },
|
||||
{ voiceId: '829824295735410756', name: '钓系女友', description: '甜美撒娇', recommended: false, language: 'zh' },
|
||||
{ voiceId: '829826792415842333', name: '播报男声', description: '沉稳播报', recommended: false, language: 'zh' },
|
||||
{ voiceId: '829826834144964676', name: '盐系少年', description: '清新少年', recommended: false, language: 'zh' },
|
||||
{ voiceId: '829826884271091753', name: '撒娇女友', description: '可爱撒娇', recommended: false, language: 'zh' },
|
||||
{
|
||||
voiceId: 'tianxin_xiaoling',
|
||||
name: '甜心小玲',
|
||||
description: '甜美可爱,活泼俏皮',
|
||||
recommended: true,
|
||||
language: 'zh',
|
||||
previewUrl: 'https://media.liche.cn/meijiaka-zj/voice/tianxin_xiaoling.mp3',
|
||||
},
|
||||
{
|
||||
voiceId: 'danya_xuejie',
|
||||
name: '淡雅学姐',
|
||||
description: '淡雅知性,温婉柔和',
|
||||
recommended: false,
|
||||
language: 'zh',
|
||||
previewUrl: 'https://media.liche.cn/meijiaka-zj/voice/danya_xuejie.mp3',
|
||||
},
|
||||
{
|
||||
voiceId: 'Chinese (Mandarin)_Warm_Girl',
|
||||
name: '温暖少女',
|
||||
description: '温暖亲切,清新自然',
|
||||
recommended: false,
|
||||
language: 'zh',
|
||||
previewUrl: 'https://media.liche.cn/meijiaka-zj/voice/Warm_Girl.mp3',
|
||||
},
|
||||
{
|
||||
voiceId: 'Chinese (Mandarin)_Radio_Host',
|
||||
name: '电台男主播',
|
||||
description: '专业播报,沉稳有力',
|
||||
recommended: false,
|
||||
language: 'zh',
|
||||
previewUrl: 'https://media.liche.cn/meijiaka-zj/voice/Radio_Host.mp3',
|
||||
},
|
||||
{
|
||||
voiceId: 'Chinese (Mandarin)_Straightforward_Boy',
|
||||
name: '率真弟弟',
|
||||
description: '率真爽朗,青春阳光',
|
||||
recommended: false,
|
||||
language: 'zh',
|
||||
previewUrl: 'https://media.liche.cn/meijiaka-zj/voice/Straightforward_Boy.mp3',
|
||||
},
|
||||
{
|
||||
voiceId: 'Chinese (Mandarin)_Gentleman',
|
||||
name: '温润男声',
|
||||
description: '温润如玉,低沉磁性',
|
||||
recommended: false,
|
||||
language: 'zh',
|
||||
previewUrl: 'https://media.liche.cn/meijiaka-zj/voice/Gentleman.mp3',
|
||||
},
|
||||
],
|
||||
});
|
||||
} finally {
|
||||
@@ -96,6 +185,144 @@ export const useVoiceStore = create<VoiceState & VoiceActions>()(
|
||||
|
||||
setSelectedVoiceId: (id) => set({ selectedVoiceId: id }),
|
||||
|
||||
// ====================== Speed ======================
|
||||
setSpeed: (speed: number) => set({ speed }),
|
||||
|
||||
// ====================== Volume ======================
|
||||
setVolume: (volume: number) => set({ volume }),
|
||||
|
||||
// ====================== Pitch ======================
|
||||
setPitch: (pitch: number) => set({ pitch }),
|
||||
|
||||
// ====================== Material library actions ======================
|
||||
loadVoiceMaterials: async () => {
|
||||
set({ isLoadingMaterials: true });
|
||||
try {
|
||||
const materials = await voiceApi.loadVoiceMaterials();
|
||||
set({ voiceMaterials: materials });
|
||||
} catch (err) {
|
||||
console.error('[VoiceStore] 加载素材库失败:', err);
|
||||
} finally {
|
||||
set({ isLoadingMaterials: false });
|
||||
}
|
||||
},
|
||||
|
||||
addVoiceMaterial: async (file: File, name: string) => {
|
||||
// 1. Upload to Qiniu Cloud
|
||||
const sourceUrl = await voiceApi.uploadAudio(file);
|
||||
|
||||
// 2. Submit the Kling clone task
|
||||
const cloneResult = await voiceApi.submitCloneTask({
|
||||
sourceAudioUrl: sourceUrl,
|
||||
voiceName: name,
|
||||
});
|
||||
|
||||
// 3. Create the local record
|
||||
const material: VoiceMaterial = {
|
||||
id: cloneResult.taskId,
|
||||
name,
|
||||
voiceId: '',
|
||||
sourceUrl,
|
||||
trialUrl: undefined,
|
||||
status: 'pending',
|
||||
createdAt: new Date().toISOString(),
|
||||
};
|
||||
|
||||
// 4. Save to local JSON
|
||||
await voiceApi.saveVoiceMaterial(material);
|
||||
set(state => ({ voiceMaterials: [material, ...state.voiceMaterials] }));
|
||||
|
||||
return material;
|
||||
},
|
||||
|
||||
updateVoiceMaterialStatus: (id: string, status: VoiceMaterial['status'], voiceId?: string, trialUrl?: string) => {
|
||||
set(state => {
|
||||
const updated: VoiceMaterial[] = state.voiceMaterials.map((m): VoiceMaterial => {
|
||||
if (m.id !== id) return m;
|
||||
return {
|
||||
...m,
|
||||
status,
|
||||
voiceId: voiceId || m.voiceId,
|
||||
trialUrl: trialUrl || m.trialUrl,
|
||||
};
|
||||
});
|
||||
// Persist the change locally as well
|
||||
const target = updated.find(m => m.id === id);
|
||||
if (target) {
|
||||
voiceApi.saveVoiceMaterial(target).catch(err => {
|
||||
console.error('[VoiceStore] 保存素材状态失败:', err);
|
||||
});
|
||||
}
|
||||
return { voiceMaterials: updated };
|
||||
});
|
||||
},
|
||||
|
||||
renameVoiceMaterial: async (id: string, name: string) => {
|
||||
set(state => {
|
||||
const updated = state.voiceMaterials.map(m => m.id === id ? { ...m, name } : m);
|
||||
const target = updated.find(m => m.id === id);
|
||||
if (target) {
|
||||
voiceApi.saveVoiceMaterial(target).catch(err => {
|
||||
console.error('[VoiceStore] 重命名素材失败:', err);
|
||||
});
|
||||
}
|
||||
return { voiceMaterials: updated };
|
||||
});
|
||||
},
|
||||
|
||||
deleteVoiceMaterial: async (materialId: string) => {
|
||||
await voiceApi.deleteVoiceMaterial(materialId);
|
||||
set(state => ({
|
||||
voiceMaterials: state.voiceMaterials.filter(m => m.id !== materialId),
|
||||
}));
|
||||
},
|
||||
|
||||
// ====================== Video material library actions ======================
|
||||
loadAvatarMaterials: async () => {
|
||||
set({ isLoadingAvatarMaterials: true });
|
||||
try {
|
||||
const materials = await voiceApi.loadAvatarMaterials();
|
||||
set({ avatarMaterials: materials });
|
||||
} catch (err) {
|
||||
console.error('[VoiceStore] 加载视频素材失败:', err);
|
||||
} finally {
|
||||
set({ isLoadingAvatarMaterials: false });
|
||||
}
|
||||
},
|
||||
|
||||
addAvatarMaterial: async (file: File, name: string) => {
|
||||
const videoUrl = await voiceApi.uploadVideo(file);
|
||||
const material: AvatarMaterial = {
|
||||
id: `avatar_${Date.now()}`,
|
||||
name,
|
||||
videoUrl,
|
||||
createdAt: new Date().toISOString(),
|
||||
};
|
||||
await voiceApi.saveAvatarMaterial(material);
|
||||
set(state => ({ avatarMaterials: [material, ...state.avatarMaterials] }));
|
||||
return material;
|
||||
},
|
||||
|
||||
renameAvatarMaterial: async (id: string, name: string) => {
|
||||
set(state => {
|
||||
const updated = state.avatarMaterials.map(m => m.id === id ? { ...m, name } : m);
|
||||
const target = updated.find(m => m.id === id);
|
||||
if (target) {
|
||||
voiceApi.saveAvatarMaterial(target).catch(err => {
|
||||
console.error('[VoiceStore] 重命名素材失败:', err);
|
||||
});
|
||||
}
|
||||
return { avatarMaterials: updated };
|
||||
});
|
||||
},
|
||||
|
||||
deleteAvatarMaterial: async (materialId: string) => {
|
||||
await voiceApi.deleteAvatarMaterial(materialId);
|
||||
set(state => ({
|
||||
avatarMaterials: state.avatarMaterials.filter(m => m.id !== materialId),
|
||||
}));
|
||||
},
|
||||
|
||||
// ====================== Project audio actions ======================
|
||||
|
||||
loadProjectAudios: async (projectId) => {
|
||||
|
||||
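Both rename actions in the hunk above call `saveVoiceMaterial` or `saveAvatarMaterial` from inside the `set` updater. One possible alternative, assuming the updater should stay free of side effects, is to compute the next list with a pure helper and let the caller persist the changed record afterwards. `renameById` is a hypothetical name that does not appear in this commit.

```typescript
// Hypothetical pure helper (not part of the commit): rename one item immutably.
interface NamedItem {
  id: string;
  name: string;
}

function renameById<T extends NamedItem>(
  items: readonly T[],
  id: string,
  name: string,
): { items: T[]; target: T | undefined } {
  // Copy only the matching item; every other element keeps its identity.
  const next: T[] = items.map(m => (m.id === id ? { ...m, name } : m));
  // The renamed record, or undefined when no item has this id.
  const target = next.find(m => m.id === id);
  return { items: next, target };
}
```

A store action could then call `set` with `items` and, only when `target` is defined, pass it to the persistence function outside the updater.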