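feat: rebuild the material library, fix Qiniu uploads, polish the dubbing page, integrate the MiniMax backend

- Material library: VoiceMaterialLibrary supports audio/video categories, a Modal dialog, and a progress dialog
- List layout: compact single-row items, gray icon buttons, rename support, ConfirmModal for deletion
- Dubbing generation: toast replaced with ProgressModal
- Private voice display: description replaced with the createdAt date
- Qiniu upload: fix the upload_stream arguments and correct the put_stream parameter name
- MiniMax backend: add Provider + Service; TTS, voice cloning, and the voice list now go through MiniMax
- Frontend default voice: tianxin_xiaoling
- Rust: add voice commands, local audio storage, and dubbing generation
- Add a shot statistics component; improve the script editor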
@@ -4,9 +4,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
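
## Project Overview

美家卡智影 (Meijiaka AI Video): an AI video creation platform. An AI-driven desktop app built on a hybrid **Tauri + React + FastAPI** architecture. Users generate scripts with AI, create digital-human videos, auto-generate subtitles, and compose the complete marketing video locally.
美家卡智剪 (Meijiaka Smart Cut): an AI short-video editing desktop app. An AI-driven desktop app built on a hybrid **Tauri + React + FastAPI** architecture. Users import long-form footage, the AI cuts it into shots according to the script, voice cloning/TTS generates the dubbing, and the result is composed into a finished short video with subtitles.

### Requirements
Core design principle: **lightweight cloud account + fully local business data**. The cloud stores only user authentication; all projects, scripts, and media live on the user's machine.

## Requirements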
|
||||
|
||||
| 组件 | 版本要求 |
|
||||
|------|----------|
|
||||
@@ -15,8 +17,6 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||
| Rust | 1.70+ |
|
||||
| Docker | 20+ (可选,用于数据库) |
|
||||
|
||||
核心设计理念:**轻量云账号 + 全本地业务数据** - 云端只存储用户认证和使用日志,所有项目/脚本/媒体都存在用户本地。
|
||||
|
||||
## 架构
|
||||
|
||||
### 混合架构
|
||||
@@ -27,11 +27,10 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||
|
||||
### 存储策略
|
||||
|
||||
核心设计理念:**轻量云账号 + 全本地业务数据** - 云端只存储用户认证和使用日志,所有项目/脚本/媒体都存在用户本地。
|
||||
核心设计理念:**轻量云账号 + 全本地业务数据** - 云端只存储用户认证,所有项目/脚本/媒体都存在用户本地。
|
||||
|
||||
- **云端**: PostgreSQL 只存储 2 张表:`users` (用户账户)、`model_usage_logs` (用量统计)
|
||||
- `avatars` 表已废弃:数字人名片元数据现在纯本地存储 `avatars.json`
|
||||
- **本地**: JSON 文件存储项目/脚本/分镜数据、数字人元数据,用户磁盘存储媒体文件,FFmpeg 处理视频合成
|
||||
- **云端**: PostgreSQL 只存储 `users` (用户账户) 表
|
||||
- **本地**: JSON 文件存储项目/脚本/分镜数据,用户磁盘存储媒体文件,FFmpeg 处理视频合成
|
||||
- **缓存/队列**: Redis + Async Engine Scheduler 处理异步任务
|
||||
|
||||
### 混合通信模式
|
||||
@@ -41,12 +40,6 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||
| HTTP → FastAPI | AI 生成、认证、配置管理 | `client.get/post/put/delete()` |
|
||||
| Tauri IPC → Rust | FFmpeg 视频处理、本地文件系统 | `ipc.request()` 或直接 `invoke()` |
|
||||
|
||||
**通信模块**:
|
||||
- `tauri-app/src/api/client.ts` - HTTP 客户端,自动处理 camelCase/snake_case 转换
|
||||
- `tauri-app/src/api/ipc.ts` - IPC 客户端
|
||||
- `tauri-app/src/api/modules/localStorage.ts` - 本地项目存储(走 IPC)
|
||||
- `tauri-app/src/api/modules/videoComposite.ts` - 视频合成(走 IPC)
|
||||
|
||||
### AI Provider 架构
|
||||
|
||||
后端 AI 模块采用多 Provider 设计:
|
||||
@@ -55,9 +48,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||
- `app/ai/providers/*` - 具体实现(OpenAI、火山引擎、KlingAI 等)
|
||||
- `app/ai/prompts/` - 提示词模板文件
|
||||
|
||||
支持的 AI 平台:火山方舟(推荐)、OpenAI、百度文心一言、阿里云通义千问、KlingAI(数字人视频生成)。
|
||||
|
||||
模型配置文件:`python-api/config/ai_models.yaml`(支持热重载)
|
||||
支持的 AI 平台:火山方舟(推荐)、OpenAI、KlingAI(TTS/声音克隆)。
|
||||
|
||||
### Token 管理
|
||||
|
||||
@@ -67,11 +58,14 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||
- 并发安全(双重检查锁定)
|
||||
- 支持 JWT、OAuth2 等多种策略
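
The double-checked locking mentioned above can be sketched roughly as follows. This is an illustration only, not the project's `token_manager.py`: the `CachedToken` class, the `fetch` coroutine, and the `skew` margin are hypothetical names introduced here.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, Tuple


class CachedToken:
    """Caches one token and refreshes it at most once per expiry under concurrency."""

    def __init__(
        self,
        fetch: Callable[[], Awaitable[Tuple[str, float]]],  # hypothetical: returns (token, ttl_seconds)
        skew: float = 30.0,  # assumed safety margin before the real expiry
    ) -> None:
        self._fetch = fetch
        self._skew = skew
        self._token: Optional[str] = None
        self._expires_at = 0.0
        self._lock = asyncio.Lock()

    def _fresh(self) -> bool:
        return self._token is not None and time.monotonic() < self._expires_at - self._skew

    async def get(self) -> str:
        # First check: no lock on the fast path.
        if self._fresh():
            return self._token
        async with self._lock:
            # Second check: another coroutine may have refreshed while we waited.
            if not self._fresh():
                token, ttl = await self._fetch()
                self._token = token
                self._expires_at = time.monotonic() + ttl
            return self._token
```

With this shape, several coroutines that call `get()` at the same moment should trigger only one upstream fetch.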
|
||||
|
||||
### Data Flow
### Async Engine Scheduling

1. **Script generation**: user input → FastAPI AI proxy → normalized output → frontend saves to local JSON
2. **Digital-human video**: backend calls the KlingAI API → returns a video URL → frontend downloads and stores it locally
3. **Video composition**: frontend → Tauri IPC → Rust backend → FFmpeg → renders the final video file
The project does not use Celery; it uses a custom Async Engine:
- **`AsyncEngine`**: runs `tick()` roughly every 10 s, loads running jobs, groups them by type, and dispatches them in parallel
- **`JobRegistry`**: Redis-based job CRUD
- **`SlotManager`**: atomic Redis Lua scripts for acquiring and releasing concurrency slots

Registered slots: Video (18), Image (9), Script (10), Subtitle (5), Copy (5), TTS, Avatar (2)
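
To make the slot idea concrete, here is a minimal in-memory sketch of per-type concurrency slots. It is not the project's `SlotManager`, which is described above as using Redis Lua scripts for atomicity; the class name and methods below are hypothetical, and the limits are the ones listed above (TTS is omitted because no limit is given for it).

```python
from typing import Dict


class InMemorySlotManager:
    """Toy stand-in for a slot manager: one counter per job type (single process only)."""

    def __init__(self, limits: Dict[str, int]) -> None:
        self._limits = dict(limits)
        self._used = {job_type: 0 for job_type in limits}

    def try_acquire(self, job_type: str) -> bool:
        """Take one slot if any is free; never blocks."""
        if self._used[job_type] >= self._limits[job_type]:
            return False
        self._used[job_type] += 1
        return True

    def release(self, job_type: str) -> None:
        """Give a slot back; releasing with nothing held is a no-op."""
        if self._used[job_type] > 0:
            self._used[job_type] -= 1

    def free(self, job_type: str) -> int:
        return self._limits[job_type] - self._used[job_type]


# Limits copied from the registered-slot list above.
slots = InMemorySlotManager(
    {"video": 18, "image": 9, "script": 10, "subtitle": 5, "copy": 5, "avatar": 2}
)
```

A scheduler tick would call `try_acquire` before dispatching a job and `release` when the job reaches a terminal state.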
|
||||
|
||||
### 本地存储结构(用户机器)
|
||||
|
||||
@@ -82,34 +76,33 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||
│ └── {project_id}/
|
||||
│ ├── meta.json # 项目元数据
|
||||
│ ├── segments.json # 脚本/分镜数据
|
||||
│ └── assets/ # 媒体文件
|
||||
├── avatars/
|
||||
│ └── {avatar_id}/
|
||||
│ ├── meta.json # 数字人名片配置
|
||||
│ └── source.mp4 # 源视频
|
||||
└── cache/ # 临时文件
|
||||
│ ├── media/ # 导入的原始素材
|
||||
│ ├── shots/ # 自动切割后的片段
|
||||
│ ├── audio/ # TTS 生成的音频
|
||||
│ └── assets/ # 资源文件(封面、成品等)
|
||||
├── products/ # 成品视频
|
||||
├── avatars.json # 形象列表(本地)
|
||||
└── cache/ # 缓存目录
|
||||
```
|
||||
|
||||
## 目录结构
|
||||
|
||||
```
|
||||
ai-meijiaka/
|
||||
meijiaka-zj/
|
||||
├── python-api/ # FastAPI 后端服务
|
||||
│ ├── app/
|
||||
│ │ ├── api/v1/ # REST API 端点
|
||||
│ │ ├── ai/ # AI 模型路由和 Provider
|
||||
│ │ ├── ai/prompts/ # 提示词模板文件
|
||||
│ │ ├── core/ # 安全、配置、异常处理
|
||||
│ │ ├── core/ # 安全、配置、Token 管理器、异常处理
|
||||
│ │ ├── crud/ # 数据访问层
|
||||
│ │ ├── db/ # 数据库配置
|
||||
│ │ ├── models/ # SQLAlchemy 数据模型
|
||||
│ │ ├── schemas/ # Pydantic 验证模型
|
||||
│ │ ├── services/ # 业务逻辑和 AI 服务代理
|
||||
│ │ ├── scheduler/ # Async Engine 统一异步调度器
|
||||
│ │ ├── config.py # 配置管理
|
||||
│ │ ├── scheduler/ # Async Engine 异步任务调度
|
||||
│ │ ├── config.py # Pydantic Settings 配置管理
|
||||
│ │ └── main.py # 应用入口
|
||||
│ ├── config/ # AI 模型配置(YAML)
|
||||
│ ├── tests/ # pytest 测试套件
|
||||
│ ├── scripts/ # 管理和测试脚本
|
||||
│ ├── config/ # AI 模型配置(YAML),支持热重载
|
||||
│ └── docker-compose.yml # Docker 服务编排
|
||||
│
|
||||
├── tauri-app/ # Tauri 桌面应用
|
||||
@@ -120,34 +113,17 @@ ai-meijiaka/
|
||||
│ │ │ └── modules/ # API 模块封装
|
||||
│ │ ├── components/ # 可复用 React 组件
|
||||
│ │ ├── pages/ # 页面组件(路由)
|
||||
│ │ │ └── VideoCreation/ # 6 步视频创作流程
|
||||
│ │ ├── store/ # Zustand 全局状态管理
|
||||
│ │ ├── hooks/ # 自定义 React Hooks
|
||||
│ │ └── utils/ # 前端工具函数
|
||||
│ ├── src-tauri/ # Rust 后端
|
||||
│ │ ├── src/
|
||||
│ │ │ ├── lib.rs # Tauri 应用入口,命令注册
|
||||
│ │ │ ├── commands/ # 按领域拆分的命令模块
|
||||
│ │ │ │ ├── asset.rs # 资源文件操作
|
||||
│ │ │ │ ├── auth_state.rs # 认证状态管理
|
||||
│ │ │ │ ├── avatar.rs # 数字人头像管理
|
||||
│ │ │ │ ├── product.rs # 产品相关
|
||||
│ │ │ │ └── project.rs # 项目存储操作
|
||||
│ │ │ ├── storage/ # 存储引擎分层
|
||||
│ │ │ │ ├── mod.rs # 模块导出
|
||||
│ │ │ │ ├── paths.rs # 路径计算
|
||||
│ │ │ │ ├── engine.rs # 核心存储引擎(原子写+文件锁)
|
||||
│ │ │ │ ├── auth.rs # 认证存储
|
||||
│ │ │ │ ├── project.rs # 项目存储
|
||||
│ │ │ │ ├── avatar.rs # 头像存储
|
||||
│ │ │ │ └── cache.rs # 缓存存储
|
||||
│ │ │ ├── ffmpeg_cmd.rs # FFmpeg 命令封装
|
||||
│ │ │ ├── video_processing.rs # 视频合成逻辑
|
||||
│ │ │ ├── api_proxy.rs # Python API 代理
|
||||
│ │ │ ├── avatar_cache.rs # 头像视频缓存管理
|
||||
│ │ │ └── utils.rs # 通用工具函数
|
||||
│ │ ├── binaries/ # 嵌入的 FFmpeg 可执行文件
|
||||
│ │ └── Cargo.toml # Rust 依赖配置
|
||||
│ └── package.json # NPM 依赖和脚本
|
||||
│ └── src-tauri/ # Rust 后端
|
||||
│ └── src/
|
||||
│ ├── lib.rs # Tauri 应用入口,命令注册
|
||||
│ ├── commands/ # 按领域拆分的命令模块
|
||||
│ ├── storage/ # 存储引擎(原子写+校验+文件锁)
|
||||
│ ├── ffmpeg_cmd.rs # FFmpeg 命令封装
|
||||
│ └── video_processing.rs # 视频合成逻辑
|
||||
│
|
||||
└── docs/ # 开发文档
|
||||
```
|
||||
@@ -167,31 +143,13 @@ make docker-run # 使用 Docker Compose 启动所有服务(db, redi
|
||||
make run # 启动 FastAPI 开发服务器
|
||||
make scheduler # 启动 Async Engine Scheduler
|
||||
make lint # 运行代码检查 (ruff + mypy)
|
||||
make lint-semantic # 语义层禁词检查
|
||||
make format # 格式化代码
|
||||
make test # 运行所有测试
|
||||
make test-cov # 运行测试并生成覆盖率报告
|
||||
make security # 运行安全扫描 (bandit + pip-audit)
|
||||
|
||||
# 手动方式
|
||||
# 安装依赖
|
||||
python -m venv venv && source venv/bin/activate
|
||||
pip install -e ".[dev]"
|
||||
|
||||
# 启动 PostgreSQL + Redis(必需)
|
||||
docker-compose up -d db redis
|
||||
|
||||
# 启动 FastAPI 开发服务器
|
||||
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
|
||||
|
||||
# 启动 Async Engine Scheduler(另开终端)
|
||||
python -m app.scheduler.main
|
||||
|
||||
# 代码质量
|
||||
black app/ # 格式化代码(行宽 100)
|
||||
ruff check app/ # 代码检查
|
||||
mypy app/ # 严格类型检查
|
||||
bandit -c pyproject.toml -r app/ # 安全扫描
|
||||
pip-audit # 依赖漏洞检测
|
||||
python scripts/check_config_architecture.py # 检查配置架构一致性
|
||||
make ci # 运行所有 CI 检查
|
||||
make clean # 清理缓存文件
|
||||
|
||||
# 导出 OpenAPI 文档到前端
|
||||
python3 -c "
|
||||
@@ -202,39 +160,11 @@ import json
|
||||
print(json.dumps(app.openapi(), indent=2, ensure_ascii=False))
|
||||
" > ../tauri-app/src/api/generated/openapi.json
|
||||
|
||||
# 测试
|
||||
pytest # 运行所有测试
|
||||
pytest tests/test_script.py -v # 运行单个测试文件
|
||||
pytest --cov=app # 覆盖率报告
|
||||
|
||||
# Docker
|
||||
docker-compose up -d # 启动所有服务(db, redis, api, scheduler)
|
||||
|
||||
# 端口占用检查
|
||||
lsof -i :8080 # 检查 8080 端口占用
|
||||
# 数据库迁移(修改模型后)
|
||||
alembic revision --autogenerate -m "description"
|
||||
alembic upgrade head
|
||||
```
|
||||
|
||||
**可用 Makefile 命令:**
|
||||
|
||||
| 命令 | 用途 |
|
||||
|------|------|
|
||||
| `make help` | 显示帮助信息 |
|
||||
| `make install` | 安装生产依赖(使用 lock 文件)|
|
||||
| `make dev` | 安装开发依赖并配置 pre-commit |
|
||||
| `make update-lock` | 更新 requirements.lock |
|
||||
| `make lint` | 运行代码检查 (ruff + mypy) |
|
||||
| `make format` | 格式化代码 (black + ruff) |
|
||||
| `make format-check` | 检查代码格式(不修改)|
|
||||
| `make test` | 运行测试 |
|
||||
| `make test-cov` | 运行测试并生成覆盖率报告 |
|
||||
| `make security` | 运行安全扫描 |
|
||||
| `make run` | 启动开发服务器 |
|
||||
| `make scheduler` | 启动 Async Engine Scheduler |
|
||||
| `make docker-run` | Docker Compose 启动全部服务 |
|
||||
| `make docker-down` | 停止 Docker 服务 |
|
||||
| `make clean` | 清理缓存文件 |
|
||||
| `make ci` | 运行所有 CI 检查 |
|
||||
|
||||
### 前端 (tauri-app)
|
||||
|
||||
```bash
|
||||
@@ -260,58 +190,13 @@ npm run stylelint # CSS 检查
|
||||
# 测试
|
||||
npm run test # 运行 Vitest
|
||||
npm run test:coverage # 覆盖率报告
|
||||
npm run test:ui # 打开 Vitest UI
|
||||
|
||||
# 类型生成
|
||||
npm run gen:api # 从 OpenAPI schema 生成 TypeScript 类型
|
||||
```
|
||||
|
||||
### 数据库迁移
|
||||
|
||||
项目使用 Alembic 进行数据库迁移:
|
||||
|
||||
```bash
|
||||
cd python-api
|
||||
|
||||
# 生成新迁移(修改模型后)
|
||||
alembic revision --autogenerate -m "description"
|
||||
|
||||
# 应用迁移
|
||||
alembic upgrade head
|
||||
|
||||
# 回滚迁移
|
||||
alembic downgrade -1
|
||||
```
|
||||
|
||||
### 开发提示
|
||||
|
||||
- **Tauri 调试**: 使用 `npm run tauri dev` 时,Rust 后端日志在终端输出,前端日志在浏览器控制台
|
||||
- **本地项目路径**: 项目数据保存在 `~/Documents/Meijiaka/projects/{project_id}/`
|
||||
- **配置修改**: AI 模型配置 `python-api/config/ai_models.yaml` 支持热重载,无需重启服务
|
||||
- **类型同步**: 修改后端 API 后,记得重新导出 OpenAPI 并运行 `npm run gen:api`
|
||||
- **Async Engine Scheduler**: 系统使用 Slot-Based Scheduler 统一调度所有第三方异步任务:
|
||||
- `video` - 数字人视频生成(18 slots)
|
||||
- `avatar_clone` - 形象克隆(2 slots)
|
||||
- `image` - 图片生成(9 slots)
|
||||
- `subtitle` - 字幕生成(5 slots)
|
||||
- `copy` - 文案提取(5 slots)
|
||||
- **任务状态**: 任务状态唯一真相源为后端 Redis,`taskStore` 不持久化,启动时从后端 `GET /tasks` 查询
|
||||
- **项目数据**: 项目元数据和分镜数据通过 IPC 显式写入本地文件,不通过 Zustand persist 持久化
|
||||
- **字幕渲染**: 使用 `assjs` 库进行 ASS/SSA 字幕预览渲染,WASM 和 Worker 文件通过 Vite 插件复制到 `public/` 目录,修改资源路径后需要检查插件配置
|
||||
|
||||
## 开发规范
|
||||
|
||||
### 后端 (Python)
|
||||
|
||||
- **格式化**: Black (行宽: 100)
|
||||
- **检查**: Ruff
|
||||
- **类型**: MyPy (strict 模式)
|
||||
- **架构**: API → Service → CRUD → Model,禁止跨层调用
|
||||
- **数据库**: 始终使用异步 SQLAlchemy,事务在 API 层控制
|
||||
- **AI 集成**: 无论使用什么提供者,输出 Schema 必须保持一致,在 Service 层标准化
|
||||
- **提示词**: 所有提示词放在 `app/ai/prompts/` 单独文件,不硬编码
|
||||
- **配置管理**: 所有配置通过 `from app.config import get_settings` 读取,禁止直接使用 `os.getenv()`,所有配置项必须在 `Settings` 类中定义
|
||||
|
||||
### 配置管理强制规范
|
||||
|
||||
**架构层级:**
|
||||
@@ -328,28 +213,16 @@ alembic downgrade -1
|
||||
- **敏感信息**(API Keys、Secrets)必须通过环境变量注入
|
||||
- **业务默认值**可以硬编码在 `Settings` 中
|
||||
|
||||
**添加新配置流程:**
|
||||
1. 在 `app/config.py` 的 `Settings` 类中添加字段定义
|
||||
2. 使用 `Field(default=..., description="...")` 提供默认值和说明
|
||||
3. 敏感信息使用 `str | None = None` 类型
|
||||
4. 更新 `.env.example` 文档
|
||||
### 统一术语表(语义治理)
|
||||
|
||||
### Rust (Tauri 后端)
|
||||
整个后端划分为 6 个语义层级,每一层只使用属于该层的术语:
|
||||
|
||||
- **格式化**: `rustfmt`(默认配置)
|
||||
- **检查**: `cargo clippy`(零警告)
|
||||
- **模块组织**: 命令按领域拆分到 `src/commands/{domain}.rs`,在 `lib.rs` 中注册
|
||||
- **存储分层**: 存储逻辑按领域拆分到 `src/storage/{domain}.rs`
|
||||
- **命令参数**: Tauri IPC 命令必须使用 Args 结构体接收参数:
|
||||
```rust
|
||||
#[derive(Deserialize)]
|
||||
#[serde(rename_all = "camelCase")]
|
||||
pub struct SaveProjectMetaArgs {
|
||||
pub project_id: String,
|
||||
pub data: serde_json::Value,
|
||||
}
|
||||
```
|
||||
- **禁止**: 命令函数直接使用 camelCase 参数名(会产生 `non_snake_case` 警告)
|
||||
| 业务概念 | 官方术语 | 使用层级 | 禁止使用的别名 |
|
||||
|---------|---------|---------|--------------|
|
||||
| 视频分镜 | `Segment` | Layer 3-6 | `shot`, `scene_desc` |
|
||||
| 调度器工作单元 | `Job` | Layer 4 | `task` |
|
||||
| 供应商侧任务 | `ProviderJob` | Layer 2 | `kling_task`, `volc_task` |
|
||||
| 供应商任务 ID | `provider_task_id` | Layer 2-4 | `kling_task_id`, `video_task_id` |
|
||||
|
||||
### 本地数据存储规范(Tauri/Rust)
|
||||
|
||||
@@ -357,13 +230,12 @@ alembic downgrade -1
|
||||
```
|
||||
Layer 1: 页面组件(Pages/Components) — 只操作 Store,禁止直接调用 IPC save
|
||||
Layer 2: Zustand Store(内存状态) — Immer 不可变更新
|
||||
Layer 3: PersistManager(持久化协调) — debounce 批量、flush 强制、错误上报
|
||||
Layer 4: API 模块(localStorageApi 等) — 类型安全的 IPC 调用封装
|
||||
Layer 5: Rust StorageEngine(文件系统) — sanitize + atomic_write + file_lock
|
||||
Layer 3: API 模块(localStorageApi 等) — 类型安全的 IPC 调用封装
|
||||
Layer 4: Rust StorageEngine(文件系统) — sanitize + atomic_write + file_lock
|
||||
```
|
||||
|
||||
**Mandatory rules:**
1. **Page components must not call `localProjectApi.saveXxx()` directly**: go through Store → PersistManager
1. **Page components must not call `localProjectApi.saveXxx()` directly**: go through the Store
2. **Rust command functions must not call `fs::write` directly**: use `StorageEngine::atomic_write_json`
3. **Every ID parameter must pass `sanitize_id`**: path parameters are checked against the whitelist `[a-zA-Z0-9_-]+`
4. **Every JSON write must be atomic**: temporary file + `fs::rename`
|
||||
@@ -374,17 +246,29 @@ Layer 5: Rust StorageEngine(文件系统) — sanitize + atomic_write + f
|
||||
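- `sanitize_filename(name)`: extracts the bare file name and rejects directory components
- `atomic_write_json(path, value)`: writes a `.tmp` file first, then renames it, so a crash cannot leave a truncated file
- `with_file_lock(path, f)`: a file lock that protects read-modify-write operations
- `read_json<T>(path)`: safe read; returns `None` if the file does not exist and `Err` if it is corrupted
- `read_json<T>(path)`: safe read; returns `None` if the file does not exist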
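
The whitelist check and the write-then-rename pattern can be illustrated with a short sketch. The real engine is Rust (`storage/engine.rs`); the Python below only mirrors the two rules for illustration, and its function bodies are assumptions rather than a port of the actual code.

```python
import json
import os
import re
from pathlib import Path
from typing import Any

# Same whitelist as the rule above: letters, digits, underscore, hyphen.
_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")


def sanitize_id(raw: str) -> str:
    """Reject anything that could escape the project directory (e.g. '../x')."""
    if not _ID_PATTERN.fullmatch(raw):
        raise ValueError(f"invalid id: {raw!r}")
    return raw


def atomic_write_json(path: Path, value: Any) -> None:
    """Write to '<name>.tmp' first, then rename over the target."""
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(value, fh, ensure_ascii=False)
        fh.flush()
        os.fsync(fh.fileno())  # make sure the bytes reach disk before the rename
    os.replace(tmp, path)  # replaces the target in one step
```

Because the rename is the last step, a reader sees either the old file or the complete new one, never a half-written file.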
|
||||
|
||||
### 前端 (TypeScript/React)
|
||||
### 前端状态管理
|
||||
|
||||
- **类型**: 严格 TypeScript 模式
|
||||
- **组件**: 函数组件 + Hooks
|
||||
- **状态管理**: Zustand 管理全局状态,Immer 处理不可变更新
|
||||
- **数据获取**: SWR 缓存,自动 localStorage 降级
|
||||
- **API 客户端**: 从后端 OpenAPI schema 自动生成类型
|
||||
- **命名风格**: camelCase(自动与后端 snake_case 转换)
|
||||
- **本地存储**: 项目数据通过 Tauri IPC 保存到 `~/Documents/Meijiaka/projects/`
|
||||
七个专门的 Zustand store:
|
||||
|
||||
| Store | 职责 | 持久化 |
|
||||
|-------|------|--------|
|
||||
| `authStore` | JWT、UserInfo、登录/登出 | Tauri `auth.json` |
|
||||
| `projectStore` | 分镜、currentStep、选题、封面配置 | 仅 UI 标志持久化 |
|
||||
| `taskStore` | 异步任务状态/进度/消息 | **无**(真相源在后端 Redis) |
|
||||
| `uiStore` | Toast 通知队列 | 无 |
|
||||
| `progressStore` | 全局进度模态框 | 无 |
|
||||
| `settingsStore` | 主题模式、用户偏好 | localStorage |
|
||||
| `voiceStore` | 语音/形象选择状态 | 无 |
|
||||
|
||||
`projectStore` **不自动保存**。数据在显式过渡点持久化到磁盘。
|
||||
|
||||
### 代码风格
|
||||
|
||||
- **Python**: Black (行宽 100), Ruff, MyPy
|
||||
- **TypeScript/React**: 严格 TypeScript, 函数组件 + Hooks, Zustand + Immer, Prettier (semi=true, singleQuote=true)
|
||||
- **Rust**: rustfmt, cargo clippy
|
||||
|
||||
### 提交规范
|
||||
|
||||
@@ -399,7 +283,7 @@ chore: 构建/工具
|
||||
|
||||
## 环境配置
|
||||
|
||||
### 后端 (.env)
|
||||
### 后端 (.env) 关键配置
|
||||
|
||||
```bash
|
||||
# 数据库
|
||||
@@ -414,15 +298,9 @@ ACCESS_TOKEN_EXPIRE_MINUTES=10080
|
||||
VOLCENGINE_API_KEY=your-volcengine-key
|
||||
VOLCENGINE_CAPTION_APPID=your-caption-appid
|
||||
VOLCENGINE_CAPTION_TOKEN=your-caption-token
|
||||
OPENAI_API_KEY=sk-your-openai-key
|
||||
KLINGAI_ACCESS_KEY=your-kling-access-key
|
||||
KLINGAI_SECRET_KEY=your-kling-secret-key
|
||||
|
||||
# 七牛云存储(数字人视频持久化)
|
||||
QINIU_ACCESS_KEY=your-qiniu-access-key
|
||||
QINIU_SECRET_KEY=your-qiniu-secret-key
|
||||
QINIU_VIDEO_BUCKET=media-bucket
|
||||
QINIU_IMAGE_BUCKET=image-bucket
|
||||
OPENAI_API_KEY=sk-your-openai-key
|
||||
|
||||
# CORS 配置
|
||||
CORS_ORIGINS=http://localhost:1420,http://127.0.0.1:1420,http://localhost:8080
|
||||
@@ -442,20 +320,14 @@ CORS_ORIGINS=http://localhost:1420,http://127.0.0.1:1420,http://localhost:8080
|
||||
| `python-api/app/api/v1/*.py` | API 端点定义 |
|
||||
| `python-api/app/ai/model_router.py` | AI 模型路由和降级 |
|
||||
| `python-api/app/services/*.py` | 业务逻辑和 AI 响应标准化 |
|
||||
| `python-api/config/ai_models.yaml` | AI 模型配置 |
|
||||
| `tauri-app/src/App.tsx` | 主 React 组件 |
|
||||
| `python-api/config/ai_models.yaml` | AI 模型配置(热重载)|
|
||||
| `python-api/app/core/token_manager.py` | API Token 缓存与自动刷新 |
|
||||
| `tauri-app/src/api/client.ts` | 智能路由的 API 客户端 |
|
||||
| `tauri-app/src/store/projectStore.ts` | 项目状态管理 |
|
||||
| `tauri-app/src-tauri/src/lib.rs` | Rust 命令注册 |
|
||||
| `tauri-app/src-tauri/src/commands/project.rs` | 项目存储 IPC 命令 |
|
||||
| `tauri-app/src-tauri/src/storage/engine.rs` | 核心存储引擎(原子写+校验)|
|
||||
| `tauri-app/src-tauri/src/ffmpeg_cmd.rs` | FFmpeg 命令封装 |
|
||||
| `tauri-app/src-tauri/src/video_processing.rs` | FFmpeg 视频合成 |
|
||||
| `tauri-app/src-tauri/src/avatar_cache.rs` | 头像视频缓存管理 |
|
||||
| `python-api/app/core/token_manager.py` | API Token 缓存与自动刷新 |
|
||||
| `python-api/app/config.py` | Pydantic Settings 配置管理 |
|
||||
| `tauri-app/src/pages/VideoCreation/SubtitleBurning.tsx` | 字幕压制页面(ASS 字幕渲染) |
|
||||
| `tauri-app/src/hooks/useAssJsRenderer.ts` | assjs 字幕渲染 Hook |
|
||||
| `tauri-app/src/utils/assGenerator.ts` | ASS 字幕文件生成工具 |
|
||||
|
||||
## 额外开发文档
|
||||
|
||||
@@ -463,56 +335,15 @@ CORS_ORIGINS=http://localhost:1420,http://127.0.0.1:1420,http://localhost:8080
|
||||
|
||||
| 文档 | 主题 |
|
||||
|------|------|
|
||||
| `docs/video-generation-flow.md` | 完整视频生成流程说明 |
|
||||
| `docs/kling-api-dev.md` | KlingAI 数字人视频 API 对接开发文档 |
|
||||
| `docs/app-update-system.md` | 应用自动更新系统设计 |
|
||||
| `docs/anytocopy-integration.md` | 版权素材集成说明 |
|
||||
| `docs/anytocopy-api.md` | 版权素材 API 文档 |
|
||||
| `docs/volcengine-video-caption-api.md` | 火山引擎字幕 API 对接 |
|
||||
| `docs/qiniu-kodo-python-sdk-guide.md` | 七牛云存储 SDK 集成指南 |
|
||||
| `docs/database-design.md` | 数据库设计文档 |
|
||||
| `docs/unified-async-scheduler.md` | 统一异步调度器设计 |
|
||||
| `docs/volcengine-video-caption-api.md` | 火山引擎字幕 API 对接 |
|
||||
| `docs/semantic-refactoring-plan.md` | 后端语义重构计划 |
|
||||
| `docs/migrate-avatars-to-local.md` | 头像数据迁移到本地说明 |
|
||||
|
||||
## 统一术语表(语义治理)
|
||||
## 视频创作核心流程
|
||||
|
||||
后端代码已完成语义治理重构,所有开发必须遵守统一术语表,禁止使用废弃别名。
|
||||
|
||||
整个后端划分为 6 个语义层级,每一层只使用属于该层的术语:
|
||||
|
||||
```
|
||||
Layer 6: Presentation (API Schema / 前端适配层) → Segment, Human, Job, Script
|
||||
Layer 5: Application (API 路由) → Segment, Human, Job, Project
|
||||
Layer 4: Orchestration (Scheduler / SlotManager) → Job, JobRecord, Slot, Handler
|
||||
Layer 3: Domain (Service / 业务逻辑) → Segment, Human, VideoComposition, Caption
|
||||
Layer 2: Adapter (Provider Client) → KlingJob, KlingElement, VolcJob, ProviderTaskId
|
||||
Layer 1: Infrastructure (DB / Redis / HTTP) → 底层技术术语
|
||||
```
|
||||
|
||||
### 术语对照表
|
||||
|
||||
| 业务概念 | 官方术语 | 使用层级 | 禁止使用的别名 |
|
||||
|---------|---------|---------|--------------|
|
||||
| 视频分镜 | `Segment` | Layer 3-6 | `shot`, `scene_desc` |
|
||||
| 数字人形象 | `Human` / `Avatar` | Layer 3-6(DB 用 `avatar`,API 用 `human_id`) | `element`, `character` |
|
||||
| 调度器工作单元 | `Job` | Layer 4 | `task` |
|
||||
| 供应商侧任务 | `ProviderJob` | Layer 2 | `kling_task`, `volc_task` |
|
||||
| 供应商任务 ID | `provider_task_id` | Layer 2-4 | `kling_task_id`, `video_task_id`, `image_task_id` |
|
||||
| 分镜状态 | `SegmentStatus` | Layer 3-4 | 裸字符串 |
|
||||
| 调度器状态 | `JobStatus` | Layer 4 | 裸字符串 |
|
||||
| 形象克隆状态 | `AvatarCloneStatus` | Layer 3 | 裸字符串 |
|
||||
| Kling 原始状态 | `KlingTaskStatus` | **Layer 2 仅限** | 泄漏到 Layer 3+ |
|
||||
|
||||
### 分层禁令
|
||||
|
||||
1. **API 层 (`app/api/v1/`)**:禁止出现 `element_id`, `kling_task_id`, `shot_type`, `omni`
|
||||
2. **Scheduler 层 (`app/scheduler/`)**:禁止出现 `task_id`(应为 `job_id`),禁止构造供应商 prompt 语法
|
||||
3. **Service 层 (`app/services/`)**:禁止出现 `<<<element_1>>>` 等供应商专用语法
|
||||
4. **Provider 层 (`app/ai/providers/`)**:允许使用 `element_id`, `kling_task_id`, `KlingTaskStatus`
|
||||
|
||||
### 类型禁令
|
||||
|
||||
- 跨层传递的接口禁止裸用 `dict[str, Any]`。`params`、`result`、`changes` 等字段必须使用 Pydantic 模型或 TypedDict
|
||||
- 状态字段禁止使用裸字符串,必须使用对应的 `StrEnum`
|
||||
- CRUD 层 `obj_in` 禁止裸字典,必须使用 `CreateSchema` / `UpdateSchema`
|
||||
1. **脚本生成** - AI 生成或粘贴口播文案,自动拆分为带预估时长的分镜
|
||||
2. **视频粗剪** - 导入长视频素材,按脚本时长自动切割为片段
|
||||
3. **语音配音** - AI 声音克隆 + TTS 合成,为每段生成配音音频
|
||||
4. **字幕压制** - 基于音频自动对齐时间轴,ASS 字幕渲染并压制到视频
|
||||
5. **封面制作** - 提取视频首帧并叠加标题样式生成封面
|
||||
6. **视频合成** - FFmpeg 拼接视频片段,替换原声为 TTS 音频,导出成品
|
||||
|
||||
@@ -0,0 +1,625 @@
|
||||
# MiniMax API Developer Guide

> Source: https://platform.minimaxi.com / https://platform.minimax.io
> Compiled: 2026-04-21

---

## 1. Service Endpoints

| Region | Base URL |
|--------|----------|
| Mainland China | `https://api.minimaxi.com` |
| International | `https://api.minimax.io` |

## 2. Authentication

**Bearer Token**

```http
Authorization: Bearer {API_KEY}
```

- Create the key on the MiniMax platform under **账户管理 > 接口密钥** (Account Management > API Keys)
- Keys usually start with `sk-api-` or `sk-cp-`

## 3. Common Response Format

```json
{
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  }
}
```

- `status_code = 0` means success
- Any non-zero value is an error; see each endpoint's error codes for the meaning
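
A small helper can centralize the `base_resp` check so every call site handles errors the same way. This is a sketch: `MiniMaxAPIError` and `ensure_success` are hypothetical names, and treating a missing `base_resp` as a failure (code `-1`) is an assumption made here.

```python
from typing import Any, Mapping


class MiniMaxAPIError(RuntimeError):
    """Raised when base_resp.status_code is non-zero or missing."""

    def __init__(self, status_code: int, status_msg: str) -> None:
        super().__init__(f"MiniMax error {status_code}: {status_msg}")
        self.status_code = status_code
        self.status_msg = status_msg


def ensure_success(payload: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the payload unchanged when status_code == 0, otherwise raise."""
    base = payload.get("base_resp") or {}
    code = base.get("status_code")
    if code != 0:
        # Assumption: a missing or non-integer code is reported as -1.
        raise MiniMaxAPIError(code if isinstance(code, int) else -1, str(base.get("status_msg", "")))
    return payload
```

Callers would wrap each decoded JSON body, for example `data = ensure_success(resp.json())`.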

---

## 4. Text Generation (Chat Completion)

### 4.1 Integration Options

Three SDK/protocol options are supported:

1. **Direct HTTP**: the native REST API
2. **OpenAI SDK**: compatible with the `/chat/completions` interface
3. **Anthropic SDK**: compatible with the Claude-style interface

### 4.2 Models

| Model | Context window | Notes |
|-------|----------------|-------|
| `MiniMax-M2.7` | 204,800 | Flagship model with self-iteration capability, about 60 tps |
| `MiniMax-M2.7-highspeed` | 204,800 | High-speed M2.7, about 100 tps |
| `MiniMax-M2.5` | 204,800 | Balances performance and cost, about 60 tps |
| `MiniMax-M2.5-highspeed` | 204,800 | High-speed M2.5, about 100 tps |
| `MiniMax-M2.1` | 204,800 | Enhanced multilingual programming |
| `MiniMax-M2` | 204,800 | Agentic capability, advanced reasoning |

### 4.3 HTTP Endpoint

```http
POST /v1/text/chatcompletion_v2
Content-Type: application/json
Authorization: Bearer {API_KEY}
```

**Request body example:**

```json
{
  "model": "MiniMax-M2.7",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello!" }
  ],
  "max_tokens": 1024,
  "temperature": 0.7,
  "stream": false
}
```

**Response example:**

```json
{
  "id": "04ecb5d9b1921ae0fb0e8da9017a5474",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Hello! How can I assist you?",
        "role": "assistant",
        "name": "MiniMax AI",
        "reasoning_content": "..."
      }
    }
  ],
  "created": 1755153113,
  "model": "MiniMax-M2.7",
  "usage": {
    "total_tokens": 249,
    "prompt_tokens": 26,
    "completion_tokens": 223,
    "completion_tokens_details": {
      "reasoning_tokens": 214
    }
  },
  "base_resp": {
    "status_code": 0,
    "status_msg": ""
  }
}
```

### 4.4 Key Fields

| Field | Type | Description |
|-------|------|-------------|
| `model` | string | Model name |
| `messages` | array | Conversation messages; supports the system/user/assistant roles |
| `max_tokens` | int | Maximum number of output tokens |
| `temperature` | float | Sampling temperature, 0-1 |
| `stream` | bool | Whether to stream the output |
| `reasoning_content` | string | Reasoning trace (supported by M2.7 and similar models) |
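
---

## 5. Speech Synthesis (TTS)

### 5.1 Capabilities

- **Synchronous TTS**: up to 10,000 characters per request; non-streaming is recommended for ≤3000 characters and streaming for >3000
- **Asynchronous long-text TTS**: up to 1,000,000 characters per request, suited to books and other long texts
- **300+ system voices** plus **custom cloned voices**
- **40 languages**
- Adjustable volume, pitch, speed, and output format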

### 5.2 Models

| Model | Notes |
|-------|-------|
| `speech-2.8-hd` | Latest HD model; renders interjections; very high voice similarity |
| `speech-2.8-turbo` | Latest Turbo model, speed first |
| `speech-2.6-hd` | HD model with excellent prosody and high cloning similarity |
| `speech-2.6-turbo` | Turbo model, supports 40 languages |
| `speech-02-hd` | Stable prosody; strong cloning similarity and audio quality |
| `speech-02-turbo` | Enhanced minor-language support, strong performance |
| `speech-01-hd` | Early HD model |
| `speech-01-turbo` | Early Turbo model |

### 5.3 Synchronous TTS HTTP Endpoint

```http
POST /v1/t2a_v2
Content-Type: application/json
Authorization: Bearer {API_KEY}
```

**Fallback address:** `https://api-bj.minimaxi.com/v1/t2a_v2`

**Request body:**

```json
{
  "model": "speech-2.8-hd",
  "text": "你好,这是测试文本。<#1.5#>这是停顿后的内容。",
  "voice_id": "male-qn-qingse",
  "speed": 1.0,
  "vol": 1.0,
  "pitch": 0,
  "audio_sample_rate": 32000,
  "bitrate": 128000,
  "format": "mp3",
  "language_boost": "auto",
  "subtitle_enable": false,
  "output_format": "url"
}
```

**Key fields:**

| Field | Type | Description |
|-------|------|-------------|
| `model` | string | TTS model version |
| `text` | string | Text to synthesize, ≤10000 characters. Supports `<#x#>` pause markers (x in seconds, 0.01-99.99) and interjection tags such as `(laughs)` (2.8 models only) |
| `voice_id` | string | Voice ID: a system preset voice or a cloned/designed voice |
| `speed` | float | Speech rate, default 1.0 |
| `vol` | float | Volume, default 1.0 |
| `pitch` | int | Pitch, default 0 |
| `format` | string | Audio format: `mp3`/`pcm`/`flac`/`wav` (`wav` is non-streaming only) |
| `audio_sample_rate` | int | Sample rate: 16000/24000/32000/44100/48000 |
| `bitrate` | int | Bitrate: 16000/32000/64000/128000 |
| `language_boost` | string | Minor-language enhancement: `auto` or a specific language name |
| `subtitle_enable` | bool | Whether to generate subtitles (sentence-level timestamps) |
| `output_format` | string | Output form: `url` (valid for 24 h) or `hex` |
| `stream` | bool | Whether to stream the output |
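
The limits in the table can be checked client-side before a request is sent. The sketch below builds a request body using the flat field layout shown in this document; `build_tts_payload` is a hypothetical helper, and counting pause markers toward the 10,000-character limit is an assumption.

```python
import re
from typing import Any, Dict

MAX_SYNC_TTS_CHARS = 10_000
ALLOWED_FORMATS = {"mp3", "pcm", "flac", "wav"}
_PAUSE_MARKER = re.compile(r"<#(\d+(?:\.\d+)?)#>")


def build_tts_payload(
    text: str,
    voice_id: str,
    *,
    model: str = "speech-2.8-hd",
    audio_format: str = "mp3",
    stream: bool = False,
    output_format: str = "url",
) -> Dict[str, Any]:
    """Validate the documented limits and return a JSON-serializable request body."""
    if len(text) > MAX_SYNC_TTS_CHARS:
        raise ValueError(f"text exceeds {MAX_SYNC_TTS_CHARS} characters")
    for match in _PAUSE_MARKER.finditer(text):
        seconds = float(match.group(1))
        if not 0.01 <= seconds <= 99.99:
            raise ValueError(f"pause {seconds}s is outside 0.01-99.99")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format}")
    if audio_format == "wav" and stream:
        raise ValueError("wav is only available for non-streaming requests")
    return {
        "model": model,
        "text": text,
        "voice_id": voice_id,
        "format": audio_format,
        "stream": stream,
        "output_format": output_format,
    }
```

Failing fast here avoids spending a network round trip on a request the API would reject anyway.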

### 5.4 Asynchronous Long-Text TTS

**Create a task:**

```http
POST /v1/t2a_async_create
```

**Query a task:**

```http
GET /v1/t2a_async_query?task_id={task_id}
```

- Up to 1,000,000 characters per request
- A text file can be used as input by uploading it and passing its `file_id`
- The returned audio URL is valid for 9 hours
- Sentence-level timestamps (subtitles) are supported

---

## 6. Voice Cloning

### 6.1 Quick Cloning

Quickly clones a voice from an audio file uploaded by the user.

```http
POST /v1/voice_clone
Content-Type: application/json
Authorization: Bearer {API_KEY}
```

**Request body:**

```json
{
  "model": "speech-2.8-hd",
  "voice_name": "我的音色",
  "audio_url": "https://example.com/voice_sample.mp3",
  "sample_audio_url": "https://example.com/enhance_sample.mp3"
}
```
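
**Notes:**
- Mono and stereo audio are both supported
- `audio_url`: the audio to clone (5-30 seconds, clean vocals)
- `sample_audio_url`: optional sample audio that improves cloning quality
- Cloning itself is **free**; you are charged the first time a cloned voice is used for TTS synthesis
- Cloned voices are **temporary**: each must be used for TTS synthesis at least once within 7 days (168 hours), or it is deleted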
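
Because an unused cloned voice is removed after 168 hours, it can help to track the deadline locally. The following is a rough sketch; the function names and the 24-hour warning margin are assumptions, and it only models the deadline for the first TTS use.

```python
from datetime import datetime, timedelta, timezone

CLONED_VOICE_TTL = timedelta(hours=168)  # 7 days, as stated above


def must_use_before(created_at: datetime) -> datetime:
    """Deadline by which the cloned voice needs its first TTS call."""
    return created_at + CLONED_VOICE_TTL


def is_at_risk(created_at: datetime, now: datetime, margin: timedelta = timedelta(hours=24)) -> bool:
    """True when less than `margin` remains before the deadline (margin is an assumed default)."""
    return now >= must_use_before(created_at) - margin


created = datetime(2026, 4, 21, tzinfo=timezone.utc)
deadline = must_use_before(created)  # 2026-04-28 00:00 UTC
```

A background job could call `is_at_risk` for each unused voice and trigger a short synthesis before the deadline passes.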

### 6.2 Querying a Cloning Task

```http
GET /v1/voice_clone?task_id={task_id}
```

When the task completes it returns a `voice_id` that can be used directly with the TTS endpoint.

### 6.3 Voice Design

Generates a custom voice from a text description.

```http
POST /v1/voice_design
```

**Notes:**
- Recommended model: `speech-02-hd`
- The generated `voice_id` can also be used for TTS
- It is also a temporary voice and must be used once within 7 days

---

## 7. Video Generation

### 7.1 Capabilities

- Text-to-video, image-to-video, first/last-frame video, and subject-reference video
- Camera control (15 camera commands)
- Asynchronous task flow: create → query → download

### 7.2 Models

| Model | Notes |
|-------|-------|
| `MiniMax-Hailuo-2.3` | Latest video model; breakthroughs in body motion, facial expression, and physical realism |
| `MiniMax-Hailuo-2.3-Fast` | High-speed image-to-video variant, better value |
| `MiniMax-Hailuo-02` | Native 1080P, SOTA instruction following, top-tier physical realism |
| `T2V-01-Director` | Director mode, supports camera commands |
| `T2V-01` | Standard text-to-video |

### 7.3 Text-to-Video

```http
POST /v1/video_generation
Content-Type: application/json
Authorization: Bearer {API_KEY}
```

**Request body:**

```json
{
  "model": "MiniMax-Hailuo-2.3",
  "prompt": "A cat wearing sunglasses, sitting on a beach chair. [Push in] The camera slowly zooms in on the cat's face.",
  "prompt_optimizer": true,
  "duration": 6,
  "resolution": "768P",
  "callback_url": "https://your-domain.com/callback"
}
```

**Key fields:**

| Field | Type | Description |
|-------|------|-------------|
| `model` | string | Video model |
| `prompt` | string | Video description, ≤2000 characters. Supports the `[command]` camera syntax |
| `prompt_optimizer` | bool | Whether to optimize the prompt automatically, default true |
| `duration` | int | Duration in seconds: 6 or 10, default 6 |
| `resolution` | string | Resolution: `720P`/`768P`/`1080P` |
| `callback_url` | string | Callback URL (optional) |

**Camera commands (15):**

| Type | Commands |
|------|----------|
| Truck | `[Truck left]`, `[Truck right]` |
| Pan | `[Pan left]`, `[Pan right]` |
| Push/pull | `[Push in]`, `[Pull out]` |
| Pedestal | `[Pedestal up]`, `[Pedestal down]` |
| Tilt | `[Tilt up]`, `[Tilt down]` |
| Zoom | `[Zoom in]`, `[Zoom out]` |
| Shake | `[Shake]` |
| Tracking | `[Tracking shot]` |
| Static | `[Static shot]` |

- Combined moves: `[Pan left,Pedestal up]` (at most 3 at once)
- Sequential moves: applied in the order they appear in the text
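
Building camera tags programmatically avoids typos in prompts. A minimal sketch, assuming the 15 command names above are the complete set; `camera_tag` is a hypothetical helper, not part of any MiniMax SDK.

```python
CAMERA_COMMANDS = frozenset({
    "Truck left", "Truck right",
    "Pan left", "Pan right",
    "Push in", "Pull out",
    "Pedestal up", "Pedestal down",
    "Tilt up", "Tilt down",
    "Zoom in", "Zoom out",
    "Shake", "Tracking shot", "Static shot",
})
MAX_COMBINED = 3  # at most three commands in one bracket


def camera_tag(*commands: str) -> str:
    """Return a bracketed tag such as '[Pan left,Pedestal up]' after validating it."""
    if not 1 <= len(commands) <= MAX_COMBINED:
        raise ValueError(f"expected 1-{MAX_COMBINED} commands, got {len(commands)}")
    unknown = [c for c in commands if c not in CAMERA_COMMANDS]
    if unknown:
        raise ValueError(f"unknown camera command(s): {unknown}")
    return "[" + ",".join(commands) + "]"


prompt = f"A cat on a beach chair. {camera_tag('Push in')} The camera closes in on its face."
```

Sequential moves are expressed by placing several tags at different points in the prompt text.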

### 7.4 Querying a Video Task

```http
GET /v1/video_generation?task_id={task_id}
```

Status: `processing` → `success`/`failed`

On success the response contains a `file_id`; download the video through the File Management API.
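
The create → query → download flow implies a polling loop on the client. Here is a sketch that keeps the HTTP call injectable so it can be tested offline; `wait_for_video`, the `query` callable, and the interval/timeout defaults are assumptions, while the status strings follow the ones listed above.

```python
import time
from typing import Any, Callable, Mapping


def wait_for_video(
    task_id: str,
    query: Callable[[str], Mapping[str, Any]],  # hypothetical: performs the GET and returns decoded JSON
    *,
    interval: float = 5.0,    # assumed polling interval in seconds
    timeout: float = 600.0,   # assumed overall limit in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task succeeds and return its file_id."""
    deadline = time.monotonic() + timeout
    while True:
        result = query(task_id)
        status = result.get("status")
        if status == "success":
            return str(result["file_id"])
        if status == "failed":
            raise RuntimeError(f"video task {task_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"video task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

In production `query` would wrap the `GET /v1/video_generation?task_id=...` request; when a `callback_url` is configured, polling may not be needed at all.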
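
### 7.5 Image-to-Video / First-Last-Frame Video / Subject-Reference Video

- **Image-to-video**: provide the first-frame image URL
- **First/last-frame video**: provide the first-frame and last-frame image URLs
- **Subject-reference video**: provide a reference image to keep the subject consistent

The fields are similar to text-to-video, with additional image URL parameters.

### 7.6 Video Agent (Template Videos)

Quickly generates videos from preset templates.

| Template ID | Template name | Description |
|-------------|---------------|-------------|
| 392747428568649728 | Diving | Upload a photo to generate a diving video |
| 393769180141805569 | Run for Life | Pet photo + beast type; generates a wilderness survival video |
| 397087679467597833 | Transformers | Car photo; generates a transforming-mecha video |
| 393881433990066176 | Still rings routine | Upload a photo to generate a still-rings gymnastics video |
| 393498001241890824 | Weightlifting | Pet photo; generates a weightlifting video |
| 393488336655310850 | Climbing | Upload a photo to generate a rock-climbing video |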

---

## 8. Image Generation

```http
POST /v1/image_generation
Content-Type: application/json
Authorization: Bearer {API_KEY}
```

**Models:** `image-01` / `image-01-live` (enhanced hand-drawn/cartoon styles)

**Capabilities:**
- Text-to-image
- Image-to-image (image reference with a person as the subject)
- Custom aspect ratio and resolution

---

## 9. Music Generation

```http
POST /v1/music_generation
Content-Type: application/json
Authorization: Bearer {API_KEY}
```

**Model:** `music-2.6`

**Capabilities:**
- Generates songs from a music description (prompt) and lyrics
- Supports covers (one-step cover version from a reference audio)
- Supports style transfer and automatic lyric extraction

---

## 10. File Management

Used to upload and download files for video/audio generation tasks.

| Operation | Method | Path |
|-----------|--------|------|
| Upload a file | POST | `/v1/files/upload` |
| List files | GET | `/v1/files` |
| Get file info | GET | `/v1/files/{file_id}` |
| Download file content | GET | `/v1/files/{file_id}/content` |
| Delete a file | DELETE | `/v1/files/{file_id}` |

**Supported formats:**

| Type | Formats |
|------|---------|
| Documents | `pdf`, `docx`, `txt`, `jsonl` |
| Audio | `mp3`, `m4a`, `wav` |

**Capacity limits:**
- Total capacity: 100GB
- Single file: 512MB

---

## 11. Callbacks

Asynchronous tasks such as video generation support webhook callbacks.

### Configuration

Pass `callback_url` when creating the task.

### Verification Flow

1. MiniMax first sends a verification request to the callback URL; its body contains a `challenge` field
2. Your server must echo it back unchanged as `{"challenge": "..."}` within 3 seconds
3. Once verification passes, subsequent task status changes are pushed automatically
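
The verification step reduces to echoing one field. Below is a framework-agnostic sketch that takes the decoded request body and returns the response body; `challenge_response` is a hypothetical name, and wiring it into FastAPI or another server is left out.

```python
from typing import Any, Dict, Mapping, Optional


def challenge_response(body: Mapping[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the echo payload for a verification request, or None for any other body."""
    if "challenge" in body:
        # Echo the value unchanged, as the verification flow requires.
        return {"challenge": body["challenge"]}
    return None
```

A route handler would call this first and return the result immediately when it is not `None`, keeping slow work (downloads, database writes) off the 3-second verification path.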

### Push Format

```json
{
  "task_id": "115334141465231360",
  "status": "success",
  "file_id": "205258526306433",
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  }
}
```

---

## 12. Pricing Reference

| Service | Billing | Reference price |
|------|---------|---------|
| Text generation (M2) | Per token | $0.3/1M input, $1.2/1M output |
| Text generation (M2.5/2.7 highspeed) | Per token | $0.6/1M input, $2/1M output |
| TTS (Turbo) | Per character | $60/1M characters |
| TTS (HD) | Per character | $100/1M characters |
| Video generation (Hailuo 6s 768P) | Per task | ~$0.33 per video |
| Image generation | Per image | ~$0.0035 per image |
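
As a worked example of the per-character TTS pricing above, the sketch below estimates the cost of a voiceover. The function and constant names are hypothetical, and the prices are only the reference figures from the table, not a billing guarantee.

```python
# Reference prices in USD per one million characters, taken from the table above.
TTS_PRICE_PER_MILLION_CHARS = {"turbo": 60.0, "hd": 100.0}


def estimate_tts_cost(char_count: int, tier: str = "hd") -> float:
    """Estimate the TTS cost in USD for `char_count` characters."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    try:
        price = TTS_PRICE_PER_MILLION_CHARS[tier]
    except KeyError:
        raise ValueError(f"unknown TTS tier: {tier}") from None
    return char_count * price / 1_000_000
```

For a 1,000-character narration this works out to roughly $0.10 on HD and $0.06 on Turbo.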

---

## 13. Error Code Quick Reference

| status_code | Description |
|-------------|------|
| 0 | Success |
| 1000 | Invalid parameters |
| 1001 | Authentication failed |
| 1002 | Insufficient balance |
| 1003 | Rate limit exceeded |
| 1004 | Internal server error |
| 1005 | Task not found |
| 1006 | Task in progress |
| 1007 | Task failed |
| 1008 | File not found |
| 1009 | Unsupported file format |
| 1010 | Text too long |
| 1011 | Too many illegal characters (async TTS) |
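
One way to use the table above is to turn `base_resp` into a typed exception. This is a sketch only: `MiniMaxAPIError` and `check_base_resp` are hypothetical names, and treating codes 1003, 1004, and 1006 as retryable is an assumption, not something the table states.

```python
from typing import Any

# Descriptions copied from the error code table above.
STATUS_DESCRIPTIONS = {
    0: "success",
    1000: "invalid parameters",
    1001: "authentication failed",
    1002: "insufficient balance",
    1003: "rate limit exceeded",
    1004: "internal server error",
    1005: "task not found",
    1006: "task in progress",
    1007: "task failed",
    1008: "file not found",
    1009: "unsupported file format",
    1010: "text too long",
    1011: "too many illegal characters (async TTS)",
}
# Assumption: these codes describe transient conditions worth retrying.
RETRYABLE_CODES = {1003, 1004, 1006}


class MiniMaxAPIError(Exception):
    """Raised when base_resp.status_code is non-zero."""

    def __init__(self, code: int, message: str) -> None:
        description = STATUS_DESCRIPTIONS.get(code, "unknown error")
        super().__init__(f"MiniMax API error {code} ({description}): {message}")
        self.code = code
        self.retryable = code in RETRYABLE_CODES


def check_base_resp(data: dict[str, Any]) -> dict[str, Any]:
    """Return `data` unchanged on success, otherwise raise MiniMaxAPIError."""
    base_resp = data.get("base_resp") or {}
    code = base_resp.get("status_code", -1)
    if code != 0:
        raise MiniMaxAPIError(code, base_resp.get("status_msg", ""))
    return data
```

Callers can then branch on `err.retryable` instead of comparing message strings.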

---

## 14. Integration Suggestions

### 14.1 TTS Integration

```python
import httpx


async def minimax_tts(text: str, voice_id: str, api_key: str) -> str:
    """Synchronous TTS; returns the audio URL."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.minimax.io/v1/t2a_v2",
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": "speech-2.8-hd",
                "text": text,
                # voice_id and speed are nested under voice_setting
                "voice_setting": {"voice_id": voice_id, "speed": 1.0},
                "output_format": "url",
                "format": "mp3",
            },
            timeout=60,
        )
        resp.raise_for_status()
        data = resp.json()
        base_resp = data.get("base_resp", {})
        if base_resp.get("status_code") != 0:
            raise Exception(base_resp.get("status_msg", "Unknown error"))
        # With output_format="url", the URL is read from "audio" (or "audio_url")
        return data["data"].get("audio") or data["data"]["audio_url"]
```

### 14.2 Video Generation Integration

```python
import asyncio

import httpx


async def create_video(prompt: str, api_key: str) -> str:
    """Create a video task; returns task_id."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.minimax.io/v1/video_generation",
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": "MiniMax-Hailuo-2.3",
                "prompt": prompt,
                "duration": 6,
                "resolution": "768P",
            },
        )
        resp.raise_for_status()
        data = resp.json()
        return data["data"]["task_id"]


async def poll_video(task_id: str, api_key: str) -> str:
    """Poll the video task; returns file_id."""
    async with httpx.AsyncClient() as client:
        for _ in range(60):  # wait up to 10 minutes (60 polls x 10 s)
            resp = await client.get(
                f"https://api.minimax.io/v1/video_generation?task_id={task_id}",
                headers={"Authorization": f"Bearer {api_key}"},
            )
            resp.raise_for_status()
            data = resp.json()
            status = data["data"].get("status")
            if status == "success":
                return data["data"]["file_id"]
            if status == "failed":
                raise Exception("video generation failed")
            await asyncio.sleep(10)
        raise TimeoutError("video generation timed out")
```

### 14.3 Voice Cloning + TTS End-to-End Flow

```python
import asyncio

import httpx


async def clone_and_synthesize(audio_url: str, text: str, api_key: str):
    """Clone a voice, then synthesize speech with it."""
    async with httpx.AsyncClient() as client:
        # 1. Submit the clone
        resp = await client.post(
            "https://api.minimax.io/v1/voice_clone",
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": "speech-2.8-hd",
                "voice_name": "cloned voice",
                "audio_url": audio_url,
            },
        )
        task_id = resp.json()["data"]["task_id"]

        # 2. Poll for the clone result (up to 120 polls x 5 s = 10 minutes)
        voice_id = None
        for _ in range(120):
            resp = await client.get(
                f"https://api.minimax.io/v1/voice_clone?task_id={task_id}",
                headers={"Authorization": f"Bearer {api_key}"},
            )
            data = resp.json()["data"]
            if data.get("status") == "succeed":
                voice_id = data["voice_id"]
                break
            await asyncio.sleep(5)

        if not voice_id:
            raise Exception("voice cloning failed or timed out")

        # 3. Synthesize TTS with the cloned voice
        resp = await client.post(
            "https://api.minimax.io/v1/t2a_v2",
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": "speech-2.8-hd",
                "text": text,
                "voice_setting": {"voice_id": voice_id},
                "output_format": "url",
            },
        )
        result = resp.json()["data"]
        return result.get("audio") or result["audio_url"]
```
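
---

## 15. Notes

1. **Temporary voice lifetime**: Cloned and designed voices are both temporary. Each must be used for at least one TTS synthesis within 7 days, otherwise it is removed.
2. **TTS URL lifetime**: Audio URLs returned by synchronous TTS are valid for 24 hours; URLs from asynchronous long-text TTS are valid for 9 hours.
3. **Streaming output**: Synchronous TTS supports streaming responses (`stream: true`), which suits real-time voice scenarios.
4. **Language boost**: Setting `language_boost` to `auto` lets the model detect the language automatically, improving results for less common languages and dialects.
5. **Video resolution and duration**: Supported resolution and duration combinations differ by model; see the table in Section 7.
6. **Pause markers in text**: Use `<#1.5#>` in TTS text to control pauses, and tags such as `(laughs)` to insert interjections (2.8 models only).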
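
The lifetimes in notes 1 and 2 are easy to track with a few date helpers. The sketch below uses hypothetical function names and assumes the 7-day window for a temporary voice starts at its creation time and no longer applies once the voice has been used.

```python
from datetime import datetime, timedelta, timezone

# Lifetimes taken from notes 1 and 2 above.
SYNC_TTS_URL_TTL = timedelta(hours=24)
ASYNC_TTS_URL_TTL = timedelta(hours=9)
TEMP_VOICE_TTL = timedelta(days=7)


def tts_url_expires_at(created_at: datetime, is_async: bool = False) -> datetime:
    """Return the moment a TTS audio URL stops being valid."""
    ttl = ASYNC_TTS_URL_TTL if is_async else SYNC_TTS_URL_TTL
    return created_at + ttl


def voice_needs_first_use(
    created_at: datetime,
    has_been_used: bool,
    now: datetime,
    margin: timedelta = timedelta(days=1),
) -> bool:
    """True when an unused temporary voice is within `margin` of its removal deadline."""
    if has_been_used:
        return False
    return now >= created_at + TEMP_VOICE_TTL - margin
```

A UI could call `voice_needs_first_use` when listing voices and show a reminder for any voice that returns `True`.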
|
||||
@@ -0,0 +1,263 @@
|
||||
# MiniMax Voice API Integration Plan

> Goal: replace the existing Kling TTS with MiniMax voice capabilities (TTS + cloning + design)

---

## 1. Architecture Overview

```
┌─────────────────┐     HTTP      ┌─────────────────────┐     HTTP      ┌──────────────┐
│   tauri-app     │ ────────────→ │     python-api      │ ────────────→ │ MiniMax API  │
│                 │               │                     │               │              │
│  VoiceDubbing   │  synthesize   │  MiniMaxTTSService  │  /v1/t2a_v2   │ Sync TTS     │
│  VoiceMaterial  │  clone/query  │  MiniMaxVoiceClone  │  /v1/voice_   │ Voice clone  │
│                 │               │                     │  clone        │              │
└─────────────────┘               └─────────────────────┘               └──────────────┘
```

**Integration principles:**
- Minimal intrusion: reuse the existing `voices.json`, the `VoiceMaterial` type, and the progress modal
- Data compatibility: store the MiniMax `voice_id` directly in the `voiceId` field (a string, so there is no format conflict)
- Flow alignment: upload → submit clone → poll → ready, identical to the existing Kling clone flow

---

## 2. Backend Integration (python-api)

### 2.1 New Configuration (`app/config.py`)

```python
MINIMAX_API_KEY: str = ""  # Bearer token
MINIMAX_BASE_URL: str = "https://api.minimax.io"  # international site
```

Add to `.env`:
```bash
MINIMAX_API_KEY=sk-api-xxxx
MINIMAX_BASE_URL=https://api.minimax.io
```
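
The two settings above can be validated once at startup so that a missing key fails fast instead of surfacing as an authentication error later. This is a standalone sketch using only the standard library; `MiniMaxConfig` and `load_minimax_config` are hypothetical names and are not the project's actual `get_settings()` mechanism.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from dataclasses import dataclass

DEFAULT_BASE_URL = "https://api.minimax.io"


@dataclass(frozen=True)
class MiniMaxConfig:
    api_key: str
    base_url: str


def load_minimax_config(env: Mapping[str, str] | None = None) -> MiniMaxConfig:
    """Read MINIMAX_API_KEY / MINIMAX_BASE_URL from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("MINIMAX_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("MINIMAX_API_KEY is not set")
    # Strip a trailing slash so endpoint paths can be appended directly.
    base_url = env.get("MINIMAX_BASE_URL", "").strip().rstrip("/")
    return MiniMaxConfig(api_key=api_key, base_url=base_url or DEFAULT_BASE_URL)
```

Passing a plain dict as `env` keeps the loader easy to unit-test without touching the real environment.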

### 2.2 New Provider (`app/ai/providers/minimax_provider.py`)

Wraps the MiniMax HTTP API and provides:

| Method | Endpoint | Purpose |
|------|---------|------|
| `tts_sync(text, voice_id, speed, ...)` | POST `/v1/t2a_v2` | Synchronous TTS |
| `tts_async_create(text, voice_id, ...)` | POST `/v1/t2a_async_create` | Asynchronous long-text TTS |
| `tts_async_query(task_id)` | GET `/v1/t2a_async_query` | Query an async task |
| `clone_voice(audio_url, voice_name)` | POST `/v1/voice_clone` | Submit a clone |
| `query_clone_task(task_id)` | GET `/v1/voice_clone` | Query a clone task |
| `design_voice(description, voice_name)` | POST `/v1/voice_design` | Voice design |
| `upload_file(file_bytes, mime_type)` | POST `/v1/files/upload` | Upload a file |

**Key design points:**
- Use `httpx.AsyncClient` for asynchronous calls
- Pass the token directly in the `Authorization: Bearer` header; no extra authentication logic is needed
- Raise errors uniformly as `Exception(f"MiniMax API error: {message}")`

### 2.3 New Service (`app/services/minimax_tts_service.py`)

Provides the business-layer wrapper, aligned with the existing `TTSService` interface:

```python
class MiniMaxTTSService:
    async def synthesize_sync(self, text, voice_id, speed=1.0) -> str:
        """Synchronous TTS; returns the audio URL."""
        # Calls provider.tts_sync and returns audio_url

    async def synthesize_async(self, text, voice_id, speed=1.0) -> dict:
        """Asynchronous long-text TTS; returns task_id."""
        # Calls provider.tts_async_create

    async def query_async_task(self, task_id) -> dict:
        """Query the status of an async task."""
        # Calls provider.tts_async_query

    async def clone_voice(self, audio_url: str, voice_name: str) -> str:
        """Submit a clone; returns task_id."""

    async def query_clone_task(self, task_id: str) -> dict:
        """Query clone status; returns {status, voice_id, trial_url}."""

    async def design_voice(self, description: str, voice_name: str) -> str:
        """Voice design; returns task_id (or voice_id)."""
```

### 2.4 Modify API Routes (`app/api/v1/voice.py`)

**Replacing existing endpoints:**

| Existing endpoint | Change |
|---------|---------|
| `POST /voice/synthesize` | Call `MiniMaxTTSService.synthesize_sync()` internally instead of Kling |
| `POST /voice/clone/submit` | Call `MiniMaxTTSService.clone_voice()` |
| `GET /voice/clone/query/{task_id}` | Call `MiniMaxTTSService.query_clone_task()` |
| `POST /voice/upload` | Keep the existing Qiniu upload logic (clone audio is uploaded to Qiniu first, then passed to MiniMax) |

**Request/response schemas are unchanged**: the frontend does not need to rename any fields.

**Note:** Audio URLs returned by MiniMax synchronous TTS are valid for 24 hours. If the frontend needs long-term storage, it must still follow the "download blob → upload to Qiniu → save locally" flow (reusing the existing VoiceDubbing logic).
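
The "download → upload → save locally" order in the note above can be sketched as follows. In the project this flow runs in the frontend; the Python version here only illustrates the sequencing. All three callables (`fetch`, `upload`, `save_local`) are hypothetical stand-ins for the real HTTP download, Qiniu upload, and local file write.

```python
from typing import Callable


def persist_tts_audio(
    audio_url: str,
    key: str,
    fetch: Callable[[str], bytes],
    upload: Callable[[str, bytes], str],
    save_local: Callable[[str, bytes], None],
) -> str:
    """Copy short-lived TTS audio to durable storage and return the durable URL."""
    audio = fetch(audio_url)  # 1. download while the 24-hour URL is still valid
    if not audio:
        raise ValueError("downloaded audio is empty")
    durable_url = upload(key, audio)  # 2. re-upload to long-term object storage
    save_local(key, audio)  # 3. keep a local copy
    return durable_url
```

Injecting the three steps as parameters keeps the ordering testable without network access.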
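
### 2.5 Preset Voice Update

All Kling string voice_ids hard-coded in the existing `TTSService.PRESET_VOICES` are deprecated.

Replace them with MiniMax system preset voices (the real voice_id list must be obtained from the MiniMax platform).

**Option A (recommended): fetch dynamically at startup**
- On service startup, call MiniMax `GET /v1/voices/preset` to get the official voice list
- Cache it in memory and return it when the frontend requests `/voice/voices`

**Option B (hard-code common voices)**
- Hard-code 5-10 common Chinese voices first (the real voice_ids must be obtained from the MiniMax platform)
- Extend to dynamic fetching later

> Suggestion: start with Option B to get things working quickly, then iterate toward Option A.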
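
Options A and B can be combined: start from the hard-coded list and replace it only when a remote fetch succeeds. The sketch below is one possible shape, not the project's implementation. `PresetVoiceCache` is a hypothetical name, and `fetch` stands in for whatever coroutine calls the preset voice endpoint.

```python
from typing import Any, Awaitable, Callable

Voice = dict[str, Any]


class PresetVoiceCache:
    """In-memory preset voice list with a hard-coded fallback."""

    def __init__(
        self,
        fallback: list[Voice],
        fetch: Callable[[], Awaitable[list[Voice]]],
    ) -> None:
        self._voices = list(fallback)
        self._fetch = fetch

    async def refresh(self) -> bool:
        """Try to load the remote list; keep the current list if the fetch fails or is empty."""
        try:
            voices = await self._fetch()
        except Exception:
            return False
        if not voices:
            return False
        self._voices = list(voices)
        return True

    def voices(self) -> list[Voice]:
        """Return a copy so callers cannot mutate the cache."""
        return list(self._voices)
```

With this shape the `/voice/voices` route always has something to return, even when the remote endpoint is unavailable at startup.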

---

## 3. Frontend Adaptation (tauri-app)

### 3.1 Files to Modify

| File | Change |
|------|---------|
| `src/api/modules/voice.ts` | `synthesizeTTS` parameters are unchanged (text/voiceId/speed); callers need no changes |
| `src/store/voiceStore.ts` | Adapt the preset voice loading logic to the MiniMax response format |
| `src/pages/VideoCreation/VoiceDubbing.tsx` | Generation flow reuses the existing logic (synthesizeTTS → download → upload to Qiniu → save) |
| `src/pages/ContentManagement/VoiceMaterialLibrary.tsx` | Clone/poll/list logic is unchanged; only backend schema compatibility needs to be ensured |

### 3.2 VoiceMaterial Data Compatibility

Existing `VoiceMaterial` structure:

```ts
interface VoiceMaterial {
  id: string;           // clone task ID
  name: string;
  voiceId: string;      // MiniMax voice_id (a string, directly compatible)
  sourceUrl: string;    // original audio URL on Qiniu
  trialUrl?: string;    // preview URL
  status: 'pending' | 'processing' | 'ready' | 'failed';
  createdAt: string;
}
```

**No changes needed**: the MiniMax `voice_id` is also a string, so the format is fully compatible.
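
To keep the frontend schema unchanged, the backend has to translate provider task states into the four `VoiceMaterial` statuses. The sketch below shows one possible mapping. The provider status strings (`Queueing`, `Processing`, `Success`, `Fail`) are an assumption based on the service layer's docstrings, and `to_voice_material` is a hypothetical helper name.

```python
from typing import Any

# Assumption: provider-side status strings; unknown values fall back to "pending".
STATUS_MAP = {
    "Queueing": "pending",
    "Processing": "processing",
    "Success": "ready",
    "Fail": "failed",
}


def to_voice_material(
    task_id: str,
    name: str,
    source_url: str,
    created_at: str,
    clone_result: dict[str, Any],
) -> dict[str, Any]:
    """Build a VoiceMaterial-shaped dict from a clone query result."""
    material: dict[str, Any] = {
        "id": task_id,
        "name": name,
        "voiceId": clone_result.get("voice_id") or "",
        "sourceUrl": source_url,
        "status": STATUS_MAP.get(clone_result.get("status", ""), "pending"),
        "createdAt": created_at,
    }
    # trialUrl is optional in the interface, so only set it when present.
    if clone_result.get("trial_url"):
        material["trialUrl"] = clone_result["trial_url"]
    return material
```

Keeping the mapping in one table means a change in provider status names touches a single place.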

### 3.3 Voiceover Generation Flow (VoiceDubbing)

Existing flow (Kling):
```
synthesizeTTS → returns audio_url → fetch downloads blob → uploadAudio (Qiniu) → saveAudio (local)
```

After replacement (MiniMax):
```
synthesizeTTS → returns audio_url → fetch downloads blob → uploadAudio (Qiniu) → saveAudio (local)
```

**The frontend is completely unchanged**; just make sure the `audio_url` returned by the backend `/voice/synthesize` is the MiniMax URL (valid for 24 hours).

### 3.4 Voice Cloning Flow (VoiceMaterialLibrary)

Existing flow (Kling):
```
upload audio → Qiniu → submitCloneTask → poll queryCloneTask → ready → save to voices.json
```

After replacement (MiniMax):
```
upload audio → Qiniu → submitCloneTask → poll queryCloneTask → ready → save to voices.json
```

**The frontend is completely unchanged**; only the backend internals of `/voice/clone/submit` and `/voice/clone/query/{task_id}` switch to MiniMax.

---

## 4. Async Task Scheduling (Scheduler)

### 4.1 Current State

The custom Async Engine (Redis slot scheduling) currently manages polling for Kling clone tasks.

### 4.2 Options

**Option A: reuse the existing Async Engine (recommended)**
- Add a `MiniMaxCloneHandler` occupying 2 slots (same level as AvatarHandler)
- Pros: unified state machine, logging, and concurrency control
- Cons: requires a new handler and Redis key

**Option B: frontend polls directly**
- After submitting a clone, the frontend polls `/voice/clone/query/{task_id}` itself with `setInterval`
- Pros: no scheduler changes
- Cons: polling stops when the user closes the page, and the task status may be lost

**Option A is recommended**, keeping scheduling unified on the backend.

### 4.3 MiniMaxCloneHandler Design

```python
class MiniMaxCloneHandler(AsyncHandler):
    slots = 3
    redis_key = "minimax:clone_slots"

    async def handle(self, job: JobRecord):
        # query_clone_task returns a dict: {status, voice_id, trial_url}
        result = await minimax_service.query_clone_task(job.provider_task_id)
        status = result.get("status")
        if status == "succeed":
            return StateChange.complete(
                voice_id=result.get("voice_id"),
                trial_url=result.get("trial_url"),
            )
        elif status == "failed":
            return StateChange.fail(error=result.get("error_message"))
        else:
            return StateChange.noop()  # keep polling
```
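
Independently of the scheduler, the polling contract behind the handler can be exercised with a plain coroutine. The sketch below is illustrative only: `poll_until_done` is a hypothetical name, `query` stands in for `query_clone_task`, and the `succeed` / `failed` status strings follow the handler design above.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def poll_until_done(
    query: Callable[[str], Awaitable[dict[str, Any]]],
    task_id: str,
    interval: float = 5.0,
    max_attempts: int = 120,
) -> dict[str, Any]:
    """Poll `query(task_id)` until it reports success, failure, or the attempts run out."""
    for _ in range(max_attempts):
        result = await query(task_id)
        status = result.get("status")
        if status == "succeed":
            return result
        if status == "failed":
            raise RuntimeError(result.get("error_message") or "clone task failed")
        # Any other status means the task is still queued or running.
        await asyncio.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

Passing `interval=0` and a fake `query` makes the loop cheap to unit-test before wiring it into the Async Engine.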

---

## 5. Development Order

### Phase 1: Base Provider + TTS (1-2h)
1. Add `MiniMaxProvider` (HTTP wrapper)
2. Add `MiniMaxTTSService`
3. Modify the `/voice/synthesize` route to replace Kling TTS
4. Configure `.env` + `config.py`
5. **Verify**: VoiceDubbing generates voiceovers correctly

### Phase 2: Voice Cloning (2-3h)
1. Add the backend logic for `/voice/clone/submit` + `/voice/clone/query`
2. Add `MiniMaxCloneHandler` to the Scheduler
3. **Verify**: the full VoiceMaterialLibrary chain, upload audio → clone → ready

### Phase 3: Preset Voices (1h)
1. Replace `PRESET_VOICES` with MiniMax system voices
2. Modify the `/voice/voices` response format (if it changes)
3. **Verify**: the preset voice list loads correctly and TTS works after selecting a voice

### Phase 4: Wrap-up (1h)
1. Clean up Kling TTS code (mark as deprecated or delete)
2. Update the `AGENTS.md` documentation
3. End-to-end testing
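
---

## 6. Risks and Caveats

| Risk | Mitigation |
|------|------|
| MiniMax preset voice_ids are unknown | Hard-code first per Option B, fetch dynamically later |
| TTS URLs expire after 24h | Reuse the existing "download → Qiniu → local" flow; no frontend changes needed |
| Cloned voices expire after 7 days | Mark them as "temporary voice" in the frontend list and prompt the user to use them regularly |
| Async long-text TTS is not needed for now | Phase 1 integrates only sync TTS; add async later as needed |
| Existing Kling video/avatar cloning is retained | Replace only the voice features; leave Kling Video/Element untouched |

---

## 7. Items for You to Confirm

1. **Preset voices**: Which system preset voices on the MiniMax platform do you want to use? I can hard-code 5-10 to start.
2. **Service region**: Use `api.minimax.io` (international) or `api.minimaxi.com` (mainland China)?
3. **Async TTS**: A single narration is currently ≤1000 characters, so sync TTS is sufficient and async long-text TTS can be skipped for now, correct?
4. **Voice design**: Do you need the ability to "generate a virtual voice from a text description", or only "clone from uploaded audio"?
5. **Removing Kling code**: After integration, should the Kling TTS/clone code be deleted entirely or kept as a fallback?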
|
||||
@@ -38,10 +38,14 @@ VOLCENGINE_BASE_URL=https://ark.cn-beijing.volces.com/api/v3
|
||||
VOLCENGINE_CAPTION_APPID=your-caption-appid
|
||||
VOLCENGINE_CAPTION_TOKEN=your-caption-token
|
||||
|
||||
# 可灵 AI(必需,用于视频生成)
|
||||
# 可灵 AI(必需,用于视频生成、形象克隆)
|
||||
KLINGAI_ACCESS_KEY=your-kling-access-key
|
||||
KLINGAI_SECRET_KEY=your-kling-secret-key
|
||||
|
||||
# MiniMax(必需,用于语音合成、语音克隆)
|
||||
MINIMAX_API_KEY=sk-api-your-minimax-key
|
||||
MINIMAX_BASE_URL=https://api.minimaxi.com
|
||||
|
||||
# OpenAI(可选)
|
||||
# OPENAI_API_KEY=sk-your-openai-key
|
||||
# OPENAI_BASE_URL=https://api.openai.com/v1
|
||||
|
||||
@@ -1 +1 @@
|
||||
{"http:Pn60lJXcaOGKvMjn5qv-OMr7wR1lp1p8QG7Ul6NK:media-liche": {"upHosts": ["http://upload-z2.qiniup.com", "http://up-z2.qiniup.com"], "ioHosts": ["http://iovip-z2.qbox.me"], "rsHosts": ["http://rs-z2.qbox.me"], "rsfHosts": ["http://rsf-z2.qbox.me"], "apiHosts": ["http://api-z2.qiniu.com"], "deadline": 1776740815}, "http:Pn60lJXcaOGKvMjn5qv-OMr7wR1lp1p8QG7Ul6NK:img-liche": {"upHosts": ["http://upload-z2.qiniup.com", "http://up-z2.qiniup.com"], "ioHosts": ["http://iovip-z2.qbox.me"], "rsHosts": ["http://rs-z2.qbox.me"], "rsfHosts": ["http://rsf-z2.qbox.me"], "apiHosts": ["http://api-z2.qiniu.com"], "deadline": 1776433218}}
|
||||
{"http:Pn60lJXcaOGKvMjn5qv-OMr7wR1lp1p8QG7Ul6NK:media-liche": {"upHosts": ["http://upload-z2.qiniup.com", "http://up-z2.qiniup.com"], "ioHosts": ["http://iovip-z2.qbox.me"], "rsHosts": ["http://rs-z2.qbox.me"], "rsfHosts": ["http://rsf-z2.qbox.me"], "apiHosts": ["http://api-z2.qiniu.com"], "deadline": 1776849652}, "http:Pn60lJXcaOGKvMjn5qv-OMr7wR1lp1p8QG7Ul6NK:img-liche": {"upHosts": ["http://upload-z2.qiniup.com", "http://up-z2.qiniup.com"], "ioHosts": ["http://iovip-z2.qbox.me"], "rsHosts": ["http://rs-z2.qbox.me"], "rsfHosts": ["http://rsf-z2.qbox.me"], "apiHosts": ["http://api-z2.qiniu.com"], "deadline": 1776433218}}
|
||||
@@ -201,5 +201,3 @@
|
||||
}
|
||||
]
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
二、文案内容
|
||||
把所有配音文案"voiceover"组合起来,生成纯文案内容
|
||||
|
||||
@@ -201,5 +201,3 @@
|
||||
}
|
||||
]
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
二、文案内容
|
||||
把所有配音文案"voiceover"组合起来,生成纯文案内容
|
||||
|
||||
@@ -219,6 +219,4 @@
|
||||
"duration": "5s"
|
||||
}
|
||||
]
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
二、文案内容
|
||||
把所有配音文案"voiceover"组合起来,生成纯文案内容
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
@@ -200,6 +200,4 @@
|
||||
"duration": "5s"
|
||||
}
|
||||
]
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
二、文案内容
|
||||
把所有配音文案"voiceover"组合起来,生成纯文案内容
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
@@ -201,5 +201,3 @@
|
||||
}
|
||||
]
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
二、文案内容
|
||||
把所有配音文案"voiceover"组合起来,生成纯文案内容
|
||||
|
||||
@@ -200,6 +200,4 @@
|
||||
"duration": "5s"
|
||||
}
|
||||
]
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
二、文案内容
|
||||
把所有配音文案"voiceover"组合起来,生成纯文案内容
|
||||
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
|
||||
@@ -798,6 +798,8 @@ class KlingAIProvider:
|
||||
voice_id: str,
|
||||
voice_language: str = "zh",
|
||||
voice_speed: float = 1.0,
|
||||
voice_volume: float = 1.0,
|
||||
voice_pitch: int = 0,
|
||||
**kwargs,
|
||||
) -> dict[str, Any]:
|
||||
"""
|
||||
@@ -807,9 +809,11 @@ class KlingAIProvider:
|
||||
|
||||
Args:
|
||||
text: 要合成的文本
|
||||
voice_id: 音色ID(官方预设或自定义音色)
|
||||
voice_id: Voice ID (official preset or custom voice)
|
||||
voice_language: 语言 (zh/en)
|
||||
voice_speed: 语速 (0.8-2.0)
|
||||
voice_volume: 音量 (0.5-10.0)
|
||||
voice_pitch: 音调 (-10 到 10)
|
||||
|
||||
Returns:
|
||||
包含音频URL和任务信息的字典
|
||||
@@ -823,6 +827,8 @@ class KlingAIProvider:
|
||||
"voice_id": voice_id,
|
||||
"voice_language": voice_language,
|
||||
"voice_speed": str(voice_speed),
|
||||
"voice_volume": str(voice_volume),
|
||||
"voice_pitch": str(voice_pitch),
|
||||
}
|
||||
|
||||
async with (
|
||||
|
||||
@@ -0,0 +1,231 @@
|
||||
"""
|
||||
MiniMax API Provider
|
||||
====================
|
||||
|
||||
封装 MiniMax 语音相关 HTTP API:
|
||||
- 同步 TTS(/v1/t2a_v2)
|
||||
- 异步长文本 TTS(/v1/t2a_async_create / /v1/t2a_async_query)
|
||||
- 语音克隆(/v1/voice_clone)
|
||||
- 文件上传(/v1/files/upload)
|
||||
|
||||
认证方式:Bearer Token(Authorization Header)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import Any
|
||||
|
||||
import aiohttp
|
||||
|
||||
from app.config import get_settings
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class MiniMaxProvider:
|
||||
"""MiniMax API 客户端封装"""
|
||||
|
||||
def __init__(self, api_key: str | None = None, base_url: str | None = None):
|
||||
settings = get_settings()
|
||||
self.api_key = api_key or settings.MINIMAX_API_KEY
|
||||
self.base_url = (base_url or settings.MINIMAX_BASE_URL).rstrip("/")
|
||||
|
||||
def _get_headers(self) -> dict[str, str]:
|
||||
return {
|
||||
"Authorization": f"Bearer {self.api_key}",
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
|
||||
# ==================== TTS 语音合成 ====================
|
||||
|
||||
async def tts_sync(
|
||||
self,
|
||||
text: str,
|
||||
voice_id: str,
|
||||
speed: float = 1.0,
|
||||
vol: float = 1.0,
|
||||
pitch: int = 0,
|
||||
format: str = "mp3",
|
||||
output_format: str = "url",
|
||||
subtitle_enable: bool = False,
|
||||
language_boost: str | None = None,
|
||||
stream: bool = False,
|
||||
**kwargs,
|
||||
) -> dict[str, Any]:
|
||||
"""
|
||||
同步语音合成
|
||||
|
||||
POST /v1/t2a_v2
|
||||
|
||||
Note: MiniMax API requires speed/vol/pitch to be integers,
|
||||
and voice_id/speed/vol/pitch must be inside "voice_setting" object.
|
||||
"""
|
||||
url = f"{self.base_url}/v1/t2a_v2"
|
||||
|
||||
payload: dict[str, Any] = {
|
||||
"model": kwargs.get("model", "speech-2.8-hd"),
|
||||
"text": text,
|
||||
"stream": stream,
|
||||
"output_format": output_format,
|
||||
"voice_setting": {
|
||||
"voice_id": voice_id,
|
||||
"speed": int(speed),
|
||||
"vol": int(vol),
|
||||
"pitch": int(pitch),
|
||||
},
|
||||
"format": format,
|
||||
"subtitle_enable": subtitle_enable,
|
||||
}
|
||||
if language_boost:
|
||||
payload["language_boost"] = language_boost
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with session.post(url, json=payload, headers=self._get_headers()) as resp:
|
||||
data = await resp.json()
|
||||
if data.get("base_resp", {}).get("status_code", -1) != 0:
|
||||
msg = data.get("base_resp", {}).get("status_msg", "Unknown error")
|
||||
raise Exception(f"MiniMax TTS error: {msg}")
|
||||
return data.get("data", {})
|
||||
|
||||
async def tts_async_create(
|
||||
self,
|
||||
text: str,
|
||||
voice_id: str,
|
||||
speed: float = 1.0,
|
||||
vol: float = 1.0,
|
||||
pitch: int = 0,
|
||||
format: str = "mp3",
|
||||
subtitle_enable: bool = False,
|
||||
**kwargs,
|
||||
) -> dict[str, Any]:
|
||||
"""
|
||||
创建异步长文本 TTS 任务
|
||||
|
||||
POST /v1/t2a_async_create
|
||||
"""
|
||||
url = f"{self.base_url}/v1/t2a_async_create"
|
||||
|
||||
payload: dict[str, Any] = {
|
||||
"model": kwargs.get("model", "speech-2.8-hd"),
|
||||
"text": text,
|
||||
"voice_id": voice_id,
|
||||
"speed": int(speed),
|
||||
"vol": int(vol),
|
||||
"pitch": int(pitch),
|
||||
"format": format,
|
||||
"subtitle_enable": subtitle_enable,
|
||||
}
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with session.post(url, json=payload, headers=self._get_headers()) as resp:
|
||||
data = await resp.json()
|
||||
if data.get("base_resp", {}).get("status_code", -1) != 0:
|
||||
msg = data.get("base_resp", {}).get("status_msg", "Unknown error")
|
||||
raise Exception(f"MiniMax TTS async create error: {msg}")
|
||||
return data.get("data", {})
|
||||
|
||||
async def tts_async_query(self, task_id: str) -> dict[str, Any]:
|
||||
"""
|
||||
查询异步 TTS 任务状态
|
||||
|
||||
GET /v1/t2a_async_query?task_id={task_id}
|
||||
"""
|
||||
url = f"{self.base_url}/v1/t2a_async_query?task_id={task_id}"
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with session.get(url, headers=self._get_headers()) as resp:
|
||||
data = await resp.json()
|
||||
if data.get("base_resp", {}).get("status_code", -1) != 0:
|
||||
msg = data.get("base_resp", {}).get("status_msg", "Unknown error")
|
||||
raise Exception(f"MiniMax TTS async query error: {msg}")
|
||||
return data.get("data", {})
|
||||
|
||||
# ==================== 语音克隆 ====================
|
||||
|
||||
async def clone_voice(
|
||||
self,
|
||||
audio_url: str,
|
||||
voice_name: str,
|
||||
sample_audio_url: str | None = None,
|
||||
model: str = "speech-2.8-hd",
|
||||
) -> dict[str, Any]:
|
||||
"""
|
||||
提交语音克隆任务
|
||||
|
||||
POST /v1/voice_clone
|
||||
"""
|
||||
url = f"{self.base_url}/v1/voice_clone"
|
||||
|
||||
payload: dict[str, Any] = {
|
||||
"model": model,
|
||||
"voice_name": voice_name,
|
||||
"audio_url": audio_url,
|
||||
}
|
||||
if sample_audio_url:
|
||||
payload["sample_audio_url"] = sample_audio_url
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with session.post(url, json=payload, headers=self._get_headers()) as resp:
|
||||
data = await resp.json()
|
||||
if data.get("base_resp", {}).get("status_code", -1) != 0:
|
||||
msg = data.get("base_resp", {}).get("status_msg", "Unknown error")
|
||||
raise Exception(f"MiniMax clone error: {msg}")
|
||||
return data.get("data", {})
|
||||
|
||||
async def query_clone_task(self, task_id: str) -> dict[str, Any]:
|
||||
"""
|
||||
查询语音克隆任务状态
|
||||
|
||||
GET /v1/voice_clone?task_id={task_id}
|
||||
"""
|
||||
url = f"{self.base_url}/v1/voice_clone?task_id={task_id}"
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with session.get(url, headers=self._get_headers()) as resp:
|
||||
data = await resp.json()
|
||||
if data.get("base_resp", {}).get("status_code", -1) != 0:
|
||||
msg = data.get("base_resp", {}).get("status_msg", "Unknown error")
|
||||
raise Exception(f"MiniMax clone query error: {msg}")
|
||||
return data.get("data", {})
|
||||
|
||||
# ==================== 文件上传 ====================
|
||||
|
||||
    async def upload_file(self, file_bytes: bytes, file_name: str, mime_type: str) -> dict[str, Any]:
        """
        Upload a file to MiniMax.

        POST /v1/files/upload
        """
        url = f"{self.base_url}/v1/files/upload"

        # Only the Authorization header is set here; the multipart Content-Type
        # (including the boundary) is generated by aiohttp from the FormData.
        headers = {"Authorization": f"Bearer {self.api_key}"}
        form = aiohttp.FormData()
        form.add_field("file", file_bytes, filename=file_name, content_type=mime_type)

        async with aiohttp.ClientSession() as session:
            async with session.post(url, data=form, headers=headers) as resp:
                # Separate name for the response body so it no longer shadows the form.
                data = await resp.json()
                if data.get("base_resp", {}).get("status_code", -1) != 0:
                    msg = data.get("base_resp", {}).get("status_msg", "Unknown error")
                    raise Exception(f"MiniMax upload error: {msg}")
                return data.get("data", {})
|
||||
|
||||
# ==================== 预设音色列表 ====================
|
||||
|
||||
async def list_preset_voices(self) -> list[dict[str, Any]]:
|
||||
"""
|
||||
查询官方预设音色列表
|
||||
|
||||
GET /v1/voices/preset
|
||||
"""
|
||||
url = f"{self.base_url}/v1/voices/preset"
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with session.get(url, headers=self._get_headers()) as resp:
|
||||
data = await resp.json()
|
||||
if data.get("base_resp", {}).get("status_code", -1) != 0:
|
||||
# 如果接口不存在,返回空列表
|
||||
logger.warning(f"MiniMax list_preset_voices failed: {data}")
|
||||
return []
|
||||
return data.get("data", {}).get("voices", [])
|
||||
@@ -8,6 +8,7 @@ from fastapi import APIRouter
|
||||
from app.api.v1 import (
|
||||
auth,
|
||||
caption,
|
||||
script,
|
||||
system,
|
||||
tasks,
|
||||
voice,
|
||||
@@ -24,6 +25,9 @@ api_router.include_router(system.router, prefix="/system", tags=["System"])
|
||||
# 任务管理模块
|
||||
api_router.include_router(tasks.router, prefix="/tasks", tags=["Tasks"])
|
||||
|
||||
# 脚本模块(生成 / 润色)
|
||||
api_router.include_router(script.router, prefix="/script", tags=["Script"])
|
||||
|
||||
# 字幕生成模块(火山引擎-豆包语音)
|
||||
api_router.include_router(caption.router, tags=["Caption"])
|
||||
|
||||
|
||||
@@ -0,0 +1,271 @@
|
||||
"""
|
||||
MiniMax TTS 语音合成服务
|
||||
==========================
|
||||
|
||||
提供语音合成、克隆的业务层封装,与现有 TTSService 接口对齐。
|
||||
|
||||
功能:
|
||||
1. 同步 TTS(短文本 ≤10000 字符)
|
||||
2. 异步长文本 TTS(大文本 ≤100万字符)
|
||||
3. 语音克隆(上传音频 → 获取 voice_id)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
|
||||
from app.ai.providers.minimax_provider import MiniMaxProvider
|
||||
from app.config import get_settings
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# MiniMax 系统预设音色(中文常用)
|
||||
MINIMAX_PRESET_VOICES = [
|
||||
{
|
||||
"voice_id": "junlang_nanyou",
|
||||
"name": "俊朗男友",
|
||||
"language": "zh",
|
||||
"description": "成熟稳重,温暖亲切",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/junlang_nanyou.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "Chinese (Mandarin)_Radio_Host",
|
||||
"name": "电台男主播",
|
||||
"language": "zh",
|
||||
"description": "专业播报,清晰有力",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/Radio_Host.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "Chinese (Mandarin)_Lyrical_Voice",
|
||||
"name": "抒情男声",
|
||||
"language": "zh",
|
||||
"description": "深情款款,富有感染力",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/Lyrical_Voice.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "tianxin_xiaoling",
|
||||
"name": "甜心小玲",
|
||||
"language": "zh",
|
||||
"description": "甜美可爱,活泼俏皮",
|
||||
"recommended": True,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/tianxin_xiaoling.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "Chinese (Mandarin)_Gentle_Senior",
|
||||
"name": "温柔学姐",
|
||||
"language": "zh",
|
||||
"description": "温柔知性,娓娓道来",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/Gentle_Senior.mp3",
|
||||
},
|
||||
{
|
||||
"voice_id": "Chinese (Mandarin)_Warm_Girl",
|
||||
"name": "温暖少女",
|
||||
"language": "zh",
|
||||
"description": "轻柔细腻,清新自然",
|
||||
"recommended": False,
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/Warm_Girl.mp3",
|
||||
},
|
||||
]
|
||||
|
||||
# 默认音色:甜心小玲
|
||||
DEFAULT_VOICE_ID = "tianxin_xiaoling"
|
||||
|
||||
|
||||
class MiniMaxTTSService:
|
||||
"""MiniMax TTS 服务封装"""
|
||||
|
||||
default_voice_id: str = DEFAULT_VOICE_ID
|
||||
|
||||
def __init__(self) -> None:
|
||||
settings = get_settings()
|
||||
self.provider = MiniMaxProvider(
|
||||
api_key=settings.MINIMAX_API_KEY,
|
||||
base_url=settings.MINIMAX_BASE_URL,
|
||||
)
|
||||
|
||||
# ==================== 同步 TTS ====================
|
||||
|
||||
async def synthesize_sync(
|
||||
self,
|
||||
text: str,
|
||||
voice_id: str | None = None,
|
||||
speed: float = 1.0,
|
||||
**kwargs,
|
||||
) -> str:
|
||||
"""
|
||||
同步语音合成,返回音频 URL。
|
||||
|
||||
Args:
|
||||
text: 待合成文本(≤10000 字符)
|
||||
voice_id: 音色 ID(默认:甜心小玲)
|
||||
speed: 语速(0.8-2.0)
|
||||
|
||||
Returns:
|
||||
音频 URL(有效期 24 小时)
|
||||
"""
|
||||
if not text or not text.strip():
|
||||
raise ValueError("text 不能为空")
|
||||
|
||||
voice = voice_id or self.default_voice_id
|
||||
|
||||
result = await self.provider.tts_sync(
|
||||
text=text,
|
||||
voice_id=voice,
|
||||
speed=speed,
|
||||
output_format="url",
|
||||
**kwargs,
|
||||
)
|
||||
|
||||
audio_url = result.get("audio") or result.get("audio_url")
|
||||
if not audio_url:
|
||||
raise ValueError("TTS 合成失败: 未返回音频 URL")
|
||||
|
||||
logger.info(f"[MiniMax TTS] 合成成功: voice_id={voice}, url={audio_url[:60]}...")
|
||||
return audio_url
|
||||
|
||||
# ==================== 异步长文本 TTS ====================
|
||||
|
||||
async def synthesize_async_create(
|
||||
self,
|
||||
text: str,
|
||||
voice_id: str | None = None,
|
||||
speed: float = 1.0,
|
||||
**kwargs,
|
||||
) -> str:
|
||||
"""
|
||||
创建异步长文本 TTS 任务,返回 task_id。
|
||||
|
||||
Args:
|
||||
text: 待合成文本(≤100万字符)
|
||||
voice_id: 音色 ID
|
||||
speed: 语速
|
||||
|
||||
Returns:
|
||||
task_id
|
||||
"""
|
||||
if not text or not text.strip():
|
||||
raise ValueError("text 不能为空")
|
||||
|
||||
voice = voice_id or self.default_voice_id
|
||||
|
||||
result = await self.provider.tts_async_create(
|
||||
text=text,
|
||||
voice_id=voice,
|
||||
speed=speed,
|
||||
**kwargs,
|
||||
)
|
||||
|
||||
task_id = result.get("task_id")
|
||||
if not task_id:
|
||||
raise ValueError("异步 TTS 任务创建失败: 未返回 task_id")
|
||||
|
||||
logger.info(f"[MiniMax TTS Async] 任务创建成功: task_id={task_id}")
|
||||
return task_id
|
||||
|
||||
    async def query_async_task(self, task_id: str) -> dict:
        """
        Query the status of an async TTS task.

        Returns:
            {
                "status": "Queueing" | "Processing" | "Success" | "Fail",
                "task_id": "...",     # always present
                "audio_url": "...",   # present on Success
                "file_id": "...",     # present on Success
                "duration": 123.45,   # present on Success (seconds)
                "error_msg": "...",   # present on Fail
            }
        """
        result = await self.provider.tts_async_query(task_id)
        status = result.get("status", "Queueing")

        ret = {
            "status": status,
            "task_id": task_id,
        }

        if status == "Success":
            ret["audio_url"] = result.get("audio_url")
            ret["file_id"] = result.get("file_id")
            ret["duration"] = result.get("duration")
        elif status == "Fail":
            ret["error_msg"] = result.get("error_msg", "任务失败")  # default message: "task failed"

        return ret

    # ==================== Voice cloning ====================

    async def clone_voice(
        self,
        audio_url: str,
        voice_name: str,
        sample_audio_url: str | None = None,
    ) -> str:
        """
        Submit a voice-cloning task and return its task_id.

        Args:
            audio_url: URL of the audio to clone (5-30 seconds, publicly accessible)
            voice_name: voice name (at most 20 characters)
            sample_audio_url: optional sample audio URL that improves cloning quality

        Returns:
            task_id
        """
        result = await self.provider.clone_voice(
            audio_url=audio_url,
            voice_name=voice_name,
            sample_audio_url=sample_audio_url,
        )
        task_id = result.get("task_id")
        if not task_id:
            # Message: "clone task submission failed: no task_id returned"
            raise ValueError("克隆任务提交失败: 未返回 task_id")
        # Log message: "submitted successfully"
        logger.info(f"[MiniMax Clone] 提交成功: task_id={task_id}")
        return task_id

    async def query_clone_task(self, task_id: str) -> dict:
        """
        Query the status of a voice-cloning task.

        Returns:
            {
                "status": "Queueing" | "Processing" | "Success" | "Fail",
                "task_id": "...",     # always present
                "voice_id": "...",    # present on Success
                "trial_url": "...",   # present on Success
                "error_msg": "...",   # present on Fail
            }
        """
        result = await self.provider.query_clone_task(task_id)
        status = result.get("status", "Queueing")

        ret = {
            "status": status,
            "task_id": task_id,
        }

        if status == "Success":
            ret["voice_id"] = result.get("voice_id")
            ret["trial_url"] = result.get("trial_url")
        elif status == "Fail":
            ret["error_msg"] = result.get("error_msg", "克隆失败")  # default message: "clone failed"

        return ret

    # ==================== Preset voices ====================

    @staticmethod
    def get_preset_voices() -> list[dict]:
        """Return the list of preset voices."""
        return MINIMAX_PRESET_VOICES

    @staticmethod
    def get_voice_by_id(voice_id: str) -> dict | None:
        """Look up a preset voice by its ID; return None if it is not found."""
        for voice in MINIMAX_PRESET_VOICES:
            if voice["voice_id"] == voice_id:
                return voice
        return None
|
||||
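The two query methods above return a small status dict whose `status` is one of `Queueing`, `Processing`, `Success`, or `Fail`. A caller would typically poll until a terminal status is reached. The sketch below is illustrative only: `poll_until_done`, `FakeService`, and the interval and timeout values are assumptions and are not part of this commit.

```python
import asyncio
from typing import Awaitable, Callable

TERMINAL_STATUSES = {"Success", "Fail"}


async def poll_until_done(
    query: Callable[[str], Awaitable[dict]],
    task_id: str,
    interval: float = 2.0,   # assumed polling interval (seconds)
    timeout: float = 120.0,  # assumed overall deadline (seconds)
) -> dict:
    """Call `query(task_id)` repeatedly until the task succeeds, fails, or times out."""
    elapsed = 0.0
    while elapsed < timeout:
        ret = await query(task_id)
        if ret["status"] in TERMINAL_STATUSES:
            return ret
        await asyncio.sleep(interval)
        elapsed += interval
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")


class FakeService:
    """Hypothetical stand-in that mimics the documented return shape."""

    def __init__(self) -> None:
        self._calls = 0

    async def query_async_task(self, task_id: str) -> dict:
        self._calls += 1
        if self._calls < 3:
            # First two polls: the task is still running.
            return {"status": "Processing", "task_id": task_id}
        return {
            "status": "Success",
            "task_id": task_id,
            "audio_url": "https://example.invalid/a.mp3",
            "file_id": "f-1",
            "duration": 1.5,
        }


def demo() -> dict:
    service = FakeService()
    return asyncio.run(
        poll_until_done(service.query_async_task, "t-1", interval=0.01)
    )
```

Because `poll_until_done` only depends on the shape of the returned dict, the same loop would work for `query_clone_task` as well.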
@@ -266,7 +266,12 @@ class QiniuService:
         }

     def upload_stream(
-        self, stream: BinaryIO, key: str, mime_type: str = "application/octet-stream"
+        self,
+        stream: BinaryIO,
+        key: str,
+        mime_type: str = "application/octet-stream",
+        bucket: str = None,
+        domain: str = None,
     ) -> dict:
         """
         Upload a file stream to Qiniu Cloud
@@ -275,20 +280,35 @@ class QiniuService:
             stream: file stream object
             key: storage key of the file
             mime_type: MIME type of the file
+            bucket: bucket name (defaults to video_bucket)
+            domain: CDN domain (defaults to video_domain)

         Returns:
             Upload result dict
         """
-        token = self.get_upload_token(key)
+        bucket = bucket or self.video_bucket
+        domain = domain or self.video_domain
+        token = self.get_upload_token(bucket, key)

+        # Measure the stream size, then rewind the pointer to the start
+        stream.seek(0, 2)
+        data_size = stream.tell()
+        stream.seek(0)
+
         ret, info = put_stream(
-            up_token=token, key=key, data_stream=stream, params=None, mime_type=mime_type
+            up_token=token,
+            key=key,
+            input_stream=stream,
+            file_name=key,
+            data_size=data_size,
+            params=None,
+            mime_type=mime_type,
         )

         if ret is None:
             raise Exception(f"上传失败: {info}")

-        return {"key": ret["key"], "hash": ret["hash"], "url": self.get_file_url(key)}
+        return {"key": ret["key"], "hash": ret["hash"], "url": self.get_file_url(domain, key)}

     def upload_audio(self, local_path: str, user_id: str = None, key: str = None) -> dict:
         """
|
||||
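The `upload_stream` change above measures the stream before handing it to `put_stream`: seek to the end, read the position, then rewind. A minimal sketch of that idiom on an in-memory stream follows; `stream_size` is a hypothetical helper name, not something defined in `QiniuService`.

```python
import io
from typing import BinaryIO


def stream_size(stream: BinaryIO) -> int:
    """Return the total byte length of a seekable stream and rewind it to the start."""
    stream.seek(0, io.SEEK_END)  # same as seek(0, 2): jump to the end
    size = stream.tell()         # the end offset equals the size in bytes
    stream.seek(0)               # rewind so the uploader reads from the first byte
    return size


buf = io.BytesIO(b"hello qiniu")
buf.read(5)  # simulate a caller that has already consumed part of the stream
size = stream_size(buf)
```

This only works for seekable streams. Rewinding matters as much as measuring: without the final `seek(0)`, an upload that reads from the current position would send nothing.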
@@ -12,6 +12,7 @@ from pathlib import Path
|
||||
|
||||
from app.ai.providers.klingai_provider import KlingAIProvider
|
||||
from app.config import get_settings
|
||||
from app.services.qiniu_service import get_qiniu_service
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -31,42 +32,60 @@ def _get_kling_provider() -> KlingAIProvider:
|
||||
|
||||
|
||||
class TTSService:
|
||||
"""Kling AI TTS 服务客户端"""
|
||||
"""
|
||||
Kling AI TTS 服务客户端
|
||||
|
||||
# Kling 官方预设音色(已知音色)
|
||||
⚠️ 已废弃:语音合成功能已迁移至 MiniMaxTTSService
|
||||
保留此文件仅用于历史兼容,新代码请使用 MiniMaxTTSService
|
||||
"""
|
||||
|
||||
# Kling 官方预设音色(已废弃,仅视频生成场景仍可能使用)
|
||||
|
||||
# Kling 官方预设音色
|
||||
PRESET_VOICES = [
|
||||
{
|
||||
"voice_id": "829824295735410756",
|
||||
"voice_id": "ai_shatang",
|
||||
"name": "钓系女友",
|
||||
"language": "zh",
|
||||
"description": "甜美撒娇",
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/ai_shatang.mp3",
|
||||
"recommended": False,
|
||||
},
|
||||
{
|
||||
"voice_id": "829826751244537879",
|
||||
"voice_id": "chat1_female_new-3",
|
||||
"name": "温柔女声",
|
||||
"language": "zh",
|
||||
"description": "温柔细腻",
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/chat1_female_new-3.mp3",
|
||||
"recommended": True,
|
||||
},
|
||||
{
|
||||
"voice_id": "829826792415842333",
|
||||
"voice_id": "yizhipiannan-v1",
|
||||
"name": "播报男声",
|
||||
"language": "zh",
|
||||
"description": "沉稳播报",
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/yizhipiannan-v1.mp3",
|
||||
"recommended": False,
|
||||
},
|
||||
{
|
||||
"voice_id": "829826834144964676",
|
||||
"voice_id": "tiexin_nanyou",
|
||||
"name": "盐系少年",
|
||||
"language": "zh",
|
||||
"description": "清新少年",
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/tiexin_nanyou.mp3",
|
||||
"recommended": False,
|
||||
},
|
||||
{
|
||||
"voice_id": "829826884271091753",
|
||||
"voice_id": "girlfriend_1_speech02",
|
||||
"name": "撒娇女友",
|
||||
"language": "zh",
|
||||
"description": "可爱撒娇",
|
||||
"previewUrl": "https://media.liche.cn/meijiaka-zj/audios/girlfriend_1_speech02.mp3",
|
||||
"recommended": False,
|
||||
},
|
||||
]
|
||||
|
||||
|
||||
def __init__(self) -> None:
|
||||
self.provider = _get_kling_provider()
|
||||
self.default_voice_id = "829826751244537879" # 温柔女声
|
||||
@@ -77,6 +96,8 @@ class TTSService:
|
||||
voice_id: str | None = None,
|
||||
speed: float = 1.0,
|
||||
voice_language: str = "zh",
|
||||
volume: float = 1.0,
|
||||
pitch: int = 0,
|
||||
) -> str:
|
||||
"""
|
||||
同步合成语音(提交任务并等待完成),返回音频 URL。
|
||||
@@ -86,6 +107,8 @@ class TTSService:
|
||||
voice_id: 音色 ID(默认使用温柔女声)
|
||||
speed: 语速 (0.8-2.0)
|
||||
voice_language: 语言 (zh/en)
|
||||
volume: 音量 (0.5-10.0)
|
||||
pitch: 音调 (-10 到 10)
|
||||
|
||||
Returns:
|
||||
音频 URL
|
||||
@@ -108,6 +131,8 @@ class TTSService:
|
||||
voice_id=voice,
|
||||
voice_language=voice_language,
|
||||
voice_speed=speed,
|
||||
voice_volume=volume,
|
||||
voice_pitch=pitch,
|
||||
)
|
||||
|
||||
task_id = result.get("task_id")
|
||||
@@ -116,11 +141,28 @@ class TTSService:
|
||||
|
||||
logger.info(f"[TTS] 任务已提交: task_id={task_id}")
|
||||
|
||||
# 先检查提交返回的结果,如果已完成直接返回
|
||||
submit_status = result.get("task_status", "")
|
||||
if submit_status == "succeed":
|
||||
audio_url = self._extract_audio_url(result)
|
||||
if audio_url:
|
||||
return audio_url
|
||||
|
||||
# 等待任务完成
|
||||
audio_url = await self._wait_for_task(task_id)
|
||||
|
||||
return audio_url
|
||||
|
||||
def _extract_audio_url(self, result: dict) -> str | None:
|
||||
"""从 Kling TTS 响应中提取音频 URL"""
|
||||
task_result = result.get("task_result", {})
|
||||
if isinstance(task_result, dict):
|
||||
audios = task_result.get("audios", [])
|
||||
if audios and isinstance(audios, list):
|
||||
return audios[0].get("url")
|
||||
# 兜底:某些响应格式直接放在顶层
|
||||
return result.get("audio_url")
|
||||
|
||||
async def _wait_for_task(self, task_id: str) -> str:
|
||||
"""等待 TTS 任务完成并返回音频 URL"""
|
||||
elapsed = 0.0
|
||||
@@ -129,21 +171,18 @@ class TTSService:
|
||||
elapsed += TTS_POLL_INTERVAL
|
||||
|
||||
result = await self.provider.get_tts_task(task_id)
|
||||
status = result.get("status") or result.get("task_status", "")
|
||||
status = result.get("task_status", "")
|
||||
|
||||
logger.debug(f"[TTS] task_id={task_id}, status={status}, elapsed={elapsed}s")
|
||||
|
||||
if status == "succeed":
|
||||
# 任务成功,返回音频 URL
|
||||
task_result = result.get("task_result", {})
|
||||
audio_url = task_result.get("audio_url") if isinstance(task_result, dict) else None
|
||||
audio_url = self._extract_audio_url(result)
|
||||
if audio_url:
|
||||
return audio_url
|
||||
# 某些响应格式直接放在 data 中
|
||||
return result.get("audio_url") or result.get("data", {}).get("audio_url", "")
|
||||
raise ValueError("TTS 任务成功但未返回音频 URL")
|
||||
|
||||
if status in ("failed", "error"):
|
||||
raise ValueError(f"TTS 任务失败: {result.get('message', '未知错误')}")
|
||||
raise ValueError(f"TTS 任务失败: {result.get('task_status_msg', '未知错误')}")
|
||||
|
||||
raise TimeoutError(f"TTS 任务等待超时({TTS_TASK_TIMEOUT}秒)")
|
||||
|
||||
@@ -154,6 +193,8 @@ class TTSService:
|
||||
voice_id: str | None = None,
|
||||
speed: float = 1.0,
|
||||
voice_language: str = "zh",
|
||||
volume: float = 1.0,
|
||||
pitch: int = 0,
|
||||
) -> Path:
|
||||
"""
|
||||
合成语音并保存到文件。
|
||||
@@ -164,6 +205,8 @@ class TTSService:
|
||||
voice_id: 音色 ID
|
||||
speed: 语速
|
||||
voice_language: 语言
|
||||
volume: 音量 (0.5-10.0)
|
||||
pitch: 音调 (-10 到 10)
|
||||
|
||||
Returns:
|
||||
输出文件路径
|
||||
@@ -179,6 +222,8 @@ class TTSService:
|
||||
voice_id=voice_id,
|
||||
speed=speed,
|
||||
voice_language=voice_language,
|
||||
volume=volume,
|
||||
pitch=pitch,
|
||||
)
|
||||
|
||||
# 下载音频并保存
|
||||
@@ -197,6 +242,8 @@ class TTSService:
|
||||
output_dir: str | Path,
|
||||
voice_id: str | None = None,
|
||||
speed: float = 1.0,
|
||||
volume: float = 1.0,
|
||||
pitch: int = 0,
|
||||
) -> list[dict]:
|
||||
"""
|
||||
批量合成多段语音。
|
||||
@@ -206,6 +253,8 @@ class TTSService:
|
||||
output_dir: 输出目录
|
||||
voice_id: 音色 ID
|
||||
speed: 语速
|
||||
volume: 音量 (0.5-10.0)
|
||||
pitch: 音调 (-10 到 10)
|
||||
|
||||
Returns:
|
||||
结果列表,每项包含 input(原始输入)和 output(输出文件路径或错误信息)
|
||||
@@ -225,6 +274,8 @@ class TTSService:
|
||||
output_path=output_dir / filename,
|
||||
voice_id=voice_id,
|
||||
speed=speed,
|
||||
volume=volume,
|
||||
pitch=pitch,
|
||||
)
|
||||
results.append({
|
||||
"index": index,
|
||||
@@ -247,7 +298,10 @@ class TTSService:
|
||||
|
||||
@staticmethod
|
||||
def get_preset_voices() -> list[dict]:
|
||||
"""获取预设音色列表"""
|
||||
"""获取预设音色列表
|
||||
|
||||
返回预先生成并上传到七牛云的试听音频 URL
|
||||
"""
|
||||
return TTSService.PRESET_VOICES
|
||||
|
||||
@staticmethod
|
||||
|
||||
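`_extract_audio_url` in the hunks above accepts two response shapes: a nested `task_result.audios[0].url` and a top-level `audio_url` fallback. The standalone sketch below mirrors that logic so both shapes can be exercised without a provider. The sample payloads are made up for illustration and are not real Kling responses.

```python
from typing import Optional


def extract_audio_url(result: dict) -> Optional[str]:
    """Mirror of the extraction logic: prefer task_result.audios[0].url."""
    task_result = result.get("task_result", {})
    if isinstance(task_result, dict):
        audios = task_result.get("audios", [])
        if audios and isinstance(audios, list):
            return audios[0].get("url")
    # Fallback: some responses carry the URL at the top level.
    return result.get("audio_url")


# Hypothetical payloads covering the nested shape, the flat shape, and a miss.
nested = {"task_result": {"audios": [{"url": "https://example.invalid/n.mp3"}]}}
flat = {"audio_url": "https://example.invalid/f.mp3"}
empty = {}
```

Note that when `audios` is a non-empty list whose first item lacks `url`, the function returns `None` without trying the top-level fallback, which matches the order of checks in the diff.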
@@ -41,7 +41,12 @@ class CloneTaskStatus(Enum):
|
||||
|
||||
|
||||
class VoiceCloneService:
|
||||
"""Kling AI 声音克隆服务客户端"""
|
||||
"""
|
||||
Kling AI 声音克隆服务客户端
|
||||
|
||||
⚠️ 已废弃:语音克隆功能已迁移至 MiniMaxTTSService
|
||||
保留此文件仅用于历史兼容,新代码请使用 MiniMaxTTSService
|
||||
"""
|
||||
|
||||
def __init__(self) -> None:
|
||||
self.provider = _get_kling_provider()
|
||||
|
||||
@@ -0,0 +1,83 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
生成预设音色试听音频并上传到七牛云
|
||||
================================================
|
||||
|
||||
这个脚本会为所有预设音色生成试听音频(文案:"您好,我是您的家装顾问。需要我为您生成几版效果图看看吗?"),
|
||||
然后自动上传到七牛云存储。
|
||||
|
||||
运行方式:
|
||||
cd python-api
|
||||
python scripts/generate_preset_voice_previews.py
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
from app.config import get_settings
|
||||
from app.services.tts_service import TTSService
|
||||
from app.services.qiniu_service import get_qiniu_service
|
||||
|
||||
# 试听文案
|
||||
PREVIEW_TEXT = "您好,我是您的家装顾问。需要我为您生成几版效果图看看吗?"
|
||||
|
||||
|
||||
async def generate_all_previews():
|
||||
"""为所有预设音色生成试听音频并上传"""
|
||||
settings = get_settings()
|
||||
tts_service = TTSService()
|
||||
qiniu = get_qiniu_service()
|
||||
|
||||
# 获取所有预设音色
|
||||
preset_voices = TTSService.PRESET_VOICES
|
||||
print(f"开始为 {len(preset_voices)} 个预设音色生成试听音频...\n")
|
||||
|
||||
for idx, voice in enumerate(preset_voices, 1):
|
||||
voice_id = voice["voice_id"]
|
||||
name = voice["name"]
|
||||
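preview_key = voice["previewUrl"].split("/", 3)[-1]  # PRESET_VOICES entries carry "previewUrl", not "previewKey"; reuse the URL path as the storage key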
|
||||
print(f"[{idx}/{len(preset_voices)}] 正在生成: {name} ({voice_id})")
|
||||
|
||||
# 创建临时文件
|
||||
with tempfile.NamedTemporaryFile(suffix='.mp3', delete=False) as f:
|
||||
temp_path = Path(f.name)
|
||||
|
||||
try:
|
||||
# 生成音频
|
||||
output_path = await tts_service.synthesize_to_file(
|
||||
text=PREVIEW_TEXT,
|
||||
output_path=temp_path,
|
||||
voice_id=voice_id,
|
||||
speed=1.0,
|
||||
voice_language=voice.get("language", "zh"),
|
||||
)
|
||||
|
||||
print(f" ✓ 生成完成,正在上传到七牛云...")
|
||||
|
||||
# 上传到七牛云
|
||||
result = qiniu.upload_file(
|
||||
local_path=str(output_path),
|
||||
key=preview_key,
|
||||
file_type="audio",
|
||||
check_duplicate=False, # 强制覆盖
|
||||
)
|
||||
|
||||
url = result["url"]
|
||||
print(f" ✓ 上传成功: {url}")
|
||||
|
||||
except Exception as e:
|
||||
print(f" ✗ 失败: {str(e)}")
|
||||
|
||||
finally:
|
||||
# 清理临时文件
|
||||
if temp_path.exists():
|
||||
temp_path.unlink()
|
||||
|
||||
print()
|
||||
|
||||
print("\n全部完成!")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(generate_all_previews())
|
||||
@@ -0,0 +1,128 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
上传官方预置音色试听音频到七牛云
|
||||
================================================
|
||||
|
||||
KlingAI 已经为每个官方预置音色提供了 trial_url,我们直接下载这个音频
|
||||
然后上传到七牛云存储,获取永久链接。
|
||||
|
||||
运行方式:
|
||||
cd python-api
|
||||
python scripts/upload_preset_voice_previews.py
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
import httpx
|
||||
|
||||
from app.config import get_settings
|
||||
from app.services.qiniu_service import get_qiniu_service
|
||||
from app.ai.providers.klingai_provider import KlingAIProvider
|
||||
|
||||
|
||||
async def download_file(url: str, temp_path: Path) -> None:
|
||||
"""下载文件到本地"""
|
||||
async with httpx.AsyncClient(timeout=60.0) as client:
|
||||
response = await client.get(url)
|
||||
response.raise_for_status()
|
||||
temp_path.write_bytes(response.content)
|
||||
|
||||
|
||||
async def upload_all_previews():
|
||||
"""下载所有官方预置音色试听并上传到七牛云"""
|
||||
settings = get_settings()
|
||||
qiniu = get_qiniu_service()
|
||||
provider = KlingAIProvider({
|
||||
"access_key": settings.KLINGAI_ACCESS_KEY or "",
|
||||
"secret_key": settings.KLINGAI_SECRET_KEY or "",
|
||||
})
|
||||
|
||||
# 获取官方预置音色列表
|
||||
voices = await provider.list_preset_voices()
|
||||
print(f"获取到 {len(voices)} 个官方预置音色\n")
|
||||
|
||||
description_map = {
|
||||
"钓系女友": "甜美撒娇",
|
||||
"温柔女声": "温柔细腻",
|
||||
"播报男声": "沉稳播报",
|
||||
"盐系少年": "清新少年",
|
||||
"撒娇女友": "可爱撒娇",
|
||||
}
|
||||
|
||||
results = []
|
||||
|
||||
for idx, voice in enumerate(voices, 1):
|
||||
if voice.get("status") != "succeed":
|
||||
print(f"[{idx}] 跳过 - 状态不为 succeed: {voice.get('status')}")
|
||||
continue
|
||||
|
||||
voice_id = voice["voice_id"]
|
||||
voice_name = voice["voice_name"]
|
||||
trial_url = voice.get("trial_url")
|
||||
|
||||
if not trial_url:
|
||||
print(f"[{idx}] {voice_name} - 没有 trial_url,跳过")
|
||||
continue
|
||||
|
||||
print(f"[{idx}/{len(voices)}] 处理: {voice_name} ({voice_id})")
|
||||
print(f" 原地址: {trial_url}")
|
||||
|
||||
# 下载到临时文件
|
||||
ext = ".wav"
|
||||
with tempfile.NamedTemporaryFile(suffix=ext, delete=False) as f:
|
||||
temp_path = Path(f.name)
|
||||
|
||||
try:
|
||||
await download_file(trial_url, temp_path)
|
||||
print(f" ✓ 下载完成 ({temp_path.stat().st_size / 1024:.1f} KB)")
|
||||
|
||||
# 上传到七牛云
|
||||
key = f"meijiaka-zj/audios/{voice_id}{ext}"
|
||||
result = qiniu.upload_file(
|
||||
local_path=str(temp_path),
|
||||
key=key,
|
||||
file_type="audio",
|
||||
check_duplicate=False,
|
||||
)
|
||||
final_url = result["url"]
|
||||
print(f" ✓ 上传成功: {final_url}")
|
||||
|
||||
results.append({
|
||||
"voice_id": voice_id,
|
||||
"name": voice_name,
|
||||
"description": description_map.get(voice_name, ""),
|
||||
"previewUrl": final_url,
|
||||
"language": "zh",
|
||||
"recommended": voice_name == "温柔女声",
|
||||
})
|
||||
|
||||
except Exception as e:
|
||||
print(f" ✗ 失败: {str(e)}")
|
||||
|
||||
finally:
|
||||
# 清理临时文件
|
||||
if temp_path.exists():
|
||||
temp_path.unlink()
|
||||
|
||||
print()
|
||||
|
||||
print("\n=== 最终结果 ===")
|
||||
print("复制以下内容到 TTSService.PRESET_VOICES:")
|
||||
print()
|
||||
for r in results:
|
||||
print(f" {{")
|
||||
print(f" \"voice_id\": \"{r['voice_id']}\",")
|
||||
print(f" \"name\": \"{r['name']}\",")
|
||||
print(f" \"language\": \"{r['language']}\",")
|
||||
print(f" \"description\": \"{r['description']}\",")
|
||||
print(f" \"previewUrl\": \"{r['previewUrl']}\",")
|
||||
print(f" \"recommended\": {r['recommended']},")  # emit Python True/False so the snippet can be pasted into PRESET_VOICES
|
||||
print(f" }},")
|
||||
|
||||
return results
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(upload_all_previews())
|
||||
@@ -3,6 +3,86 @@
|
||||
use crate::ApiResponse;
|
||||
use crate::storage::voice as voice_storage;
|
||||
|
||||
// --------------------- 音色素材库命令 ---------------------
|
||||
|
||||
#[derive(serde::Deserialize)]
|
||||
#[serde(rename_all = "camelCase")]
|
||||
pub struct VoiceMaterialArgs {
|
||||
pub id: String,
|
||||
pub name: String,
|
||||
pub voice_id: String,
|
||||
pub source_url: String,
|
||||
pub trial_url: Option<String>,
|
||||
pub status: String,
|
||||
pub created_at: String,
|
||||
}
|
||||
|
||||
/// 加载音色素材库
|
||||
#[tauri::command]
|
||||
pub async fn load_voice_materials() -> ApiResponse<Vec<voice_storage::VoiceMaterial>> {
|
||||
match voice_storage::load_voice_materials() {
|
||||
Ok(list) => ApiResponse {
|
||||
code: 200,
|
||||
message: "素材库加载成功".to_string(),
|
||||
data: Some(list.materials),
|
||||
},
|
||||
Err(e) => ApiResponse {
|
||||
code: 500,
|
||||
message: format!("加载素材库失败: {}", e),
|
||||
data: Some(vec![]),
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
/// 保存音色素材
|
||||
#[tauri::command]
|
||||
pub async fn save_voice_material(
|
||||
args: VoiceMaterialArgs,
|
||||
) -> ApiResponse<bool> {
|
||||
let material = voice_storage::VoiceMaterial {
|
||||
id: args.id,
|
||||
name: args.name,
|
||||
voice_id: args.voice_id,
|
||||
source_url: args.source_url,
|
||||
trial_url: args.trial_url,
|
||||
status: args.status,
|
||||
created_at: args.created_at,
|
||||
};
|
||||
match voice_storage::add_voice_material(material) {
|
||||
Ok(_) => ApiResponse {
|
||||
code: 200,
|
||||
message: "素材保存成功".to_string(),
|
||||
data: Some(true),
|
||||
},
|
||||
Err(e) => ApiResponse {
|
||||
code: 500,
|
||||
message: format!("保存素材失败: {}", e),
|
||||
data: Some(false),
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
/// 删除音色素材
|
||||
#[tauri::command]
|
||||
pub async fn delete_voice_material_cmd(
|
||||
id: String,
|
||||
) -> ApiResponse<bool> {
|
||||
match voice_storage::delete_voice_material(&id) {
|
||||
Ok(_) => ApiResponse {
|
||||
code: 200,
|
||||
message: "素材删除成功".to_string(),
|
||||
data: Some(true),
|
||||
},
|
||||
Err(e) => ApiResponse {
|
||||
code: 500,
|
||||
message: format!("删除素材失败: {}", e),
|
||||
data: Some(false),
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
// --------------------- 音频文件命令 ---------------------
|
||||
|
||||
#[derive(serde::Deserialize)]
|
||||
#[serde(rename_all = "camelCase")]
|
||||
pub struct SaveAudioArgs {
|
||||
|
||||
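The argument structs above use `#[serde(rename_all = "camelCase")]`, so the frontend sends `voiceId`, `sourceUrl`, `trialUrl`, and `createdAt` while the Rust fields stay snake_case. The Python sketch below only approximates that renaming for simple lowercase snake_case names. It is not serde's implementation, and `to_camel_case` is a hypothetical helper.

```python
def to_camel_case(name: str) -> str:
    """Convert a lowercase snake_case field name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


# Field names taken from the VoiceMaterialArgs struct in the diff.
RUST_FIELDS = ["id", "name", "voice_id", "source_url", "trial_url", "status", "created_at"]
JSON_KEYS = [to_camel_case(field) for field in RUST_FIELDS]
```

The resulting keys line up with the `VoiceMaterial` TypeScript interface added later in this commit, which is why the object can be passed to `invoke` unchanged.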
@@ -114,6 +114,10 @@ pub fn run() {
|
||||
commands::voice::list_project_audios,
|
||||
commands::voice::delete_audio,
|
||||
commands::voice::get_project_audios_dir,
|
||||
// 音色素材库
|
||||
commands::voice::load_voice_materials,
|
||||
commands::voice::save_voice_material,
|
||||
commands::voice::delete_voice_material_cmd,
|
||||
// 音频处理
|
||||
replace_audio_track,
|
||||
mix_audio_tracks,
|
||||
|
||||
@@ -77,6 +77,13 @@ pub fn get_avatars_json_path() -> Result<PathBuf, StorageError> {
|
||||
Ok(base.join("avatars.json"))
|
||||
}
|
||||
|
||||
/// 获取私有音色素材库 JSON 路径
|
||||
/// ~/Documents/Meijiaka-zj/voices.json
|
||||
pub fn get_voices_json_path() -> Result<PathBuf, StorageError> {
|
||||
let base = get_meijiaka_dir()?;
|
||||
Ok(base.join("voices.json"))
|
||||
}
|
||||
|
||||
/// 获取认证状态文件路径
|
||||
/// {app_config_dir}/auth.json
|
||||
pub fn get_auth_state_path(app: &AppHandle) -> Result<PathBuf, StorageError> {
|
||||
|
||||
@@ -6,7 +6,7 @@ use serde::{Deserialize, Serialize};
|
||||
use std::path::{Path, PathBuf};
|
||||
|
||||
use crate::storage::engine::{atomic_write_bytes, atomic_write_json, read_json, ensure_dir, StorageError};
|
||||
use crate::storage::paths::get_project_dir;
|
||||
use crate::storage::paths::{get_project_dir, get_voices_json_path};
|
||||
|
||||
/// 音频文件元数据
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
@@ -162,3 +162,90 @@ pub fn get_project_audios_dir(project_id: &str) -> Result<PathBuf, StorageError>
|
||||
fn chrono_lite_now() -> String {
|
||||
chrono::Utc::now().to_rfc3339()
|
||||
}
|
||||
|
||||
// ====================== 私有音色素材库 ======================
|
||||
|
||||
/// 音色素材记录
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "camelCase")]
|
||||
pub struct VoiceMaterial {
|
||||
pub id: String,
|
||||
pub name: String,
|
||||
pub voice_id: String,
|
||||
pub source_url: String,
|
||||
pub trial_url: Option<String>,
|
||||
pub status: String, // pending / processing / ready / failed
|
||||
pub created_at: String,
|
||||
}
|
||||
|
||||
/// 音色素材库列表
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
|
||||
#[serde(rename_all = "camelCase")]
|
||||
pub struct VoiceMaterialsList {
|
||||
pub materials: Vec<VoiceMaterial>,
|
||||
pub updated_at: String,
|
||||
}
|
||||
|
||||
/// 加载音色素材库
|
||||
pub fn load_voice_materials() -> Result<VoiceMaterialsList, StorageError> {
|
||||
let path = get_voices_json_path()?;
|
||||
Ok(read_json(&path)?.unwrap_or_default())
|
||||
}
|
||||
|
||||
/// 保存音色素材库
|
||||
pub fn save_voice_materials(list: &VoiceMaterialsList) -> Result<(), StorageError> {
|
||||
let path = get_voices_json_path()?;
|
||||
atomic_write_json(&path, list)
|
||||
}
|
||||
|
||||
/// 添加音色素材
|
||||
pub fn add_voice_material(material: VoiceMaterial) -> Result<(), StorageError> {
|
||||
let mut list = load_voice_materials()?;
|
||||
// 去重:相同 id 替换
|
||||
if let Some(pos) = list.materials.iter().position(|m| m.id == material.id) {
|
||||
list.materials[pos] = material;
|
||||
} else {
|
||||
list.materials.push(material);
|
||||
}
|
||||
list.updated_at = chrono_lite_now();
|
||||
save_voice_materials(&list)
|
||||
}
|
||||
|
||||
/// 更新音色素材状态
|
||||
pub fn update_voice_material_status(
|
||||
id: &str,
|
||||
status: &str,
|
||||
voice_id: Option<&str>,
|
||||
trial_url: Option<&str>,
|
||||
) -> Result<(), StorageError> {
|
||||
let mut list = load_voice_materials()?;
|
||||
if let Some(pos) = list.materials.iter().position(|m| m.id == id) {
|
||||
list.materials[pos].status = status.to_string();
|
||||
if let Some(vid) = voice_id {
|
||||
list.materials[pos].voice_id = vid.to_string();
|
||||
}
|
||||
if let Some(url) = trial_url {
|
||||
list.materials[pos].trial_url = Some(url.to_string());
|
||||
}
|
||||
list.updated_at = chrono_lite_now();
|
||||
save_voice_materials(&list)
|
||||
} else {
|
||||
Err(StorageError::Io(std::io::Error::new(
|
||||
std::io::ErrorKind::NotFound,
|
||||
format!("音色素材 {} 不存在", id),
|
||||
)))
|
||||
}
|
||||
}
|
||||
|
||||
/// 删除音色素材
|
||||
pub fn delete_voice_material(id: &str) -> Result<(), StorageError> {
|
||||
let mut list = load_voice_materials()?;
|
||||
let pos = list.materials.iter().position(|m| m.id == id)
|
||||
.ok_or_else(|| StorageError::Io(std::io::Error::new(
|
||||
std::io::ErrorKind::NotFound,
|
||||
format!("音色素材 {} 不存在", id),
|
||||
)))?;
|
||||
list.materials.remove(pos);
|
||||
list.updated_at = chrono_lite_now();
|
||||
save_voice_materials(&list)
|
||||
}
|
||||
|
||||
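`add_voice_material` above is an upsert keyed on `id`, persisted through `atomic_write_json`. The diff does not show how `atomic_write_json` is implemented; the sketch below assumes the common approach of writing a temporary file in the same directory and renaming it over the target. All function names here are illustrative.

```python
from __future__ import annotations

import json
import os
import tempfile
from pathlib import Path


def upsert(materials: list[dict], material: dict) -> list[dict]:
    """Replace the entry with the same id, or append if none exists."""
    for i, existing in enumerate(materials):
        if existing["id"] == material["id"]:
            materials[i] = material
            return materials
    materials.append(material)
    return materials


def atomic_write_json(path: Path, data: object) -> None:
    """Write JSON to a temp file next to `path`, then rename it over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False)
        os.replace(tmp_name, path)  # readers see the old or the new file, never a partial one
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file if anything failed
        raise


def demo() -> list[dict]:
    with tempfile.TemporaryDirectory() as d:
        path = Path(d) / "voices.json"
        materials: list[dict] = []
        upsert(materials, {"id": "a", "name": "first"})
        upsert(materials, {"id": "b", "name": "second"})
        upsert(materials, {"id": "a", "name": "renamed"})  # same id: replaced in place
        atomic_write_json(path, {"materials": materials})
        return json.loads(path.read_text(encoding="utf-8"))["materials"]
```

Replacing in place keeps the original ordering of the list, which is what the `position` lookup in the Rust code does as well.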
@@ -8,7 +8,7 @@ import Sidebar from './components/Layout/Sidebar';
|
||||
import Login from './pages/Login/Login';
|
||||
import VideoCreation from './pages/VideoCreation';
|
||||
import MyWorks from './pages/ContentManagement/MyWorks';
|
||||
import AvatarClone from './pages/ContentManagement/AvatarClone';
|
||||
import VoiceMaterialLibrary from './pages/ContentManagement/VoiceMaterialLibrary';
|
||||
import AboutUs from './pages/Settings/AboutUs';
|
||||
import SystemUpdate from './pages/Settings/SystemUpdate';
|
||||
import ThemeSettings from './pages/Settings/ThemeSettings';
|
||||
@@ -41,7 +41,7 @@ export const useNavigation = () => useContext(NavigationContext);
|
||||
// 页面类型
|
||||
export type PageType =
|
||||
| 'video-creation'
|
||||
| 'avatar-clone'
|
||||
| 'voice-material'
|
||||
| 'my-works'
|
||||
| 'about-us'
|
||||
| 'system-update'
|
||||
@@ -52,7 +52,7 @@ export type PageType =
|
||||
// 页面组件映射表(模块级别,避免每次渲染重建组件实例)
|
||||
const pages: Record<PageType, React.ComponentType> = {
|
||||
'video-creation': VideoCreation,
|
||||
'avatar-clone': AvatarClone,
|
||||
'voice-material': VoiceMaterialLibrary,
|
||||
'my-works': MyWorks,
|
||||
'about-us': AboutUs,
|
||||
'system-update': SystemUpdate,
|
||||
|
||||
@@ -15,6 +15,8 @@ export interface VoiceInfo {
|
||||
name: string;
|
||||
description: string;
|
||||
recommended: boolean;
|
||||
language?: string;
|
||||
previewUrl?: string;
|
||||
}
|
||||
|
||||
export interface TTSSynthesizeRequest {
|
||||
@@ -35,10 +37,12 @@ export interface TTSBatchRequest {
|
||||
segments: TTSBatchSegment[];
|
||||
voiceId?: string;
|
||||
speed?: number;
|
||||
volume?: number;
|
||||
pitch?: number;
|
||||
}
|
||||
|
||||
export interface TTSResult {
|
||||
audioBase64: string;
|
||||
audioUrl: string;
|
||||
format: string;
|
||||
text: string;
|
||||
voiceId: string;
|
||||
@@ -73,6 +77,25 @@ export interface VoiceCloneTaskResponse {
|
||||
errorMessage?: string;
|
||||
}
|
||||
|
||||
// ====================== 素材库类型 ======================
|
||||
|
||||
export interface VoiceMaterial {
|
||||
id: string;
|
||||
name: string;
|
||||
voiceId: string; // Kling 返回的音色 ID
|
||||
sourceUrl: string; // 七牛云原始音频 URL
|
||||
trialUrl?: string; // Kling 试听 URL
|
||||
status: 'pending' | 'processing' | 'ready' | 'failed';
|
||||
createdAt: string;
|
||||
}
|
||||
|
||||
export interface AvatarMaterial {
|
||||
id: string;
|
||||
name: string;
|
||||
videoUrl: string; // 七牛云视频 URL
|
||||
createdAt: string;
|
||||
}
|
||||
|
||||
// ====================== 音频文件管理类型 ======================
|
||||
|
||||
export interface AudioMeta {
|
||||
@@ -90,20 +113,40 @@ export interface AudioMeta {
|
||||
|
||||
/** 获取预设音色列表 */
|
||||
export async function getVoiceList(): Promise<VoiceInfo[]> {
|
||||
const data = await client.get<{ voiceId: string; name: string; description: string; recommended: boolean }[]>('/voice/voices');
|
||||
const data = await client.get<{ voiceId: string; name: string; description: string; recommended: boolean; previewUrl?: string; language?: string }[]>('/voice/voices');
|
||||
return data.map(v => ({
|
||||
voiceId: v.voiceId,
|
||||
name: v.name,
|
||||
description: v.description,
|
||||
recommended: v.recommended,
|
||||
language: v.language,
|
||||
previewUrl: v.previewUrl,
|
||||
}));
|
||||
}
|
||||
|
||||
/** 同步 TTS 合成(返回 base64) */
|
||||
/** 同步 TTS 合成(返回音频 URL) */
|
||||
export async function synthesizeTTS(request: TTSSynthesizeRequest): Promise<TTSResult> {
|
||||
return client.post<TTSResult>('/voice/synthesize', request);
|
||||
}
|
||||
|
||||
/** 上传音频文件到七牛云 */
|
||||
export async function uploadAudio(file: File): Promise<string> {
|
||||
const formData = new FormData();
|
||||
formData.append('file', file);
|
||||
formData.append('file_type', 'audio');
|
||||
const result = await client.postForm<{ url: string; key: string }>('/voice/upload', formData);
|
||||
return result.url;
|
||||
}
|
||||
|
||||
/** 上传视频文件到七牛云 */
|
||||
export async function uploadVideo(file: File): Promise<string> {
|
||||
const formData = new FormData();
|
||||
formData.append('file', file);
|
||||
formData.append('file_type', 'video');
|
||||
const result = await client.postForm<{ url: string; key: string }>('/voice/upload', formData);
|
||||
return result.url;
|
||||
}
|
||||
|
||||
/** 批量 TTS 合成 */
|
||||
export async function synthesizeBatchTTS(request: TTSBatchRequest): Promise<TTSBatchResult> {
|
||||
return client.post<TTSBatchResult>('/voice/synthesize-batch', request);
|
||||
@@ -126,6 +169,75 @@ export async function cloneAndWait(request: VoiceCloneSubmitRequest, pollInterva
|
||||
return client.post<VoiceCloneTaskResponse>('/voice/clone/clone-and-wait', { ...request, pollInterval });
|
||||
}
|
||||
|
||||
// ====================== 素材库 API ======================
|
||||
|
||||
/** 从本地加载音色素材库 */
|
||||
export async function loadVoiceMaterials(): Promise<VoiceMaterial[]> {
|
||||
const result = await invoke<{ code: number; data?: VoiceMaterial[]; message: string }>('load_voice_materials');
|
||||
if (result.code !== 200) {
|
||||
throw new Error(result.message || '加载素材库失败');
|
||||
}
|
||||
return result.data || [];
|
||||
}
|
||||
|
||||
/** 保存音色素材到本地 */
|
||||
export async function saveVoiceMaterial(material: VoiceMaterial): Promise<void> {
|
||||
const result = await invoke<{ code: number; message: string }>('save_voice_material', { args: material });
|
||||
if (result.code !== 200) {
|
||||
throw new Error(result.message || '保存素材失败');
|
||||
}
|
||||
}
|
||||
|
||||
/** 删除本地音色素材 */
|
||||
export async function deleteVoiceMaterial(materialId: string): Promise<void> {
|
||||
const result = await invoke<{ code: number; message: string }>('delete_voice_material_cmd', { id: materialId });
|
||||
if (result.code !== 200) {
|
||||
throw new Error(result.message || '删除素材失败');
|
||||
}
|
||||
}
|
||||
|
||||
// ====================== Video material library API (reuses avatars.json) ======================
|
||||
|
||||
export async function loadAvatarMaterials(): Promise<AvatarMaterial[]> {
|
||||
const result = await invoke<{ code: number; data?: AvatarMaterial[]; message: string }>('load_avatars_list');
|
||||
if (result.code !== 200) {
|
||||
throw new Error(result.message || '加载视频素材失败');
|
||||
}
|
||||
// avatars.json may store a plain array, so map each entry to an AvatarMaterial
|
||||
const raw = result.data || [];
|
||||
return raw.map((item: AvatarMaterial) => ({
|
||||
id: item.id,
|
||||
name: item.name,
|
||||
videoUrl: item.videoUrl,
|
||||
createdAt: item.createdAt,
|
||||
}));
|
||||
}
|
||||
|
||||
export async function saveAvatarMaterial(material: AvatarMaterial): Promise<void> {
|
||||
const list = await loadAvatarMaterials();
|
||||
const exists = list.findIndex(m => m.id === material.id);
|
||||
const updated = exists >= 0
|
||||
? list.map((m, i) => i === exists ? material : m)
|
||||
: [material, ...list];
|
||||
const result = await invoke<{ code: number; message: string }>('save_avatars_list', {
|
||||
avatars: updated,
|
||||
});
|
||||
if (result.code !== 200) {
|
||||
throw new Error(result.message || '保存视频素材失败');
|
||||
}
|
||||
}
|
||||
|
||||
export async function deleteAvatarMaterial(materialId: string): Promise<void> {
|
||||
const list = await loadAvatarMaterials();
|
||||
const filtered = list.filter(m => m.id !== materialId);
|
||||
const result = await invoke<{ code: number; message: string }>('save_avatars_list', {
|
||||
avatars: filtered,
|
||||
});
|
||||
if (result.code !== 200) {
|
||||
throw new Error(result.message || '删除视频素材失败');
|
||||
}
|
||||
}
|
||||
|
||||
// ====================== 本地音频文件管理(Tauri IPC) ======================
|
||||
|
||||
/** 保存音频文件到本地 */
|
||||
|
||||
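Every Tauri-backed helper above repeats the same check: treat any `code` other than 200 as a failure and surface `message`, otherwise return `data` with a default. Sketched in Python, that envelope handling could be factored into one helper. `unwrap` and `ApiError` are hypothetical names, and the response dicts are made-up examples of the `{ code, message, data }` shape.

```python
from typing import Any


class ApiError(Exception):
    """Raised when the response envelope reports a non-200 code."""


def unwrap(response: dict, fallback_message: str, default: Any = None) -> Any:
    """Return `data` from a {code, message, data} envelope, or raise ApiError."""
    if response.get("code") != 200:
        # Prefer the message from the backend; fall back to a caller-supplied one.
        raise ApiError(response.get("message") or fallback_message)
    data = response.get("data")
    return default if data is None else data


ok = unwrap({"code": 200, "message": "ok", "data": [{"id": "a"}]}, "load failed", default=[])
missing = unwrap({"code": 200, "message": "ok"}, "load failed", default=[])

try:
    unwrap({"code": 500, "message": "disk error"}, "load failed")
    error_text = ""
except ApiError as exc:
    error_text = str(exc)
```

One detail worth noting on the Rust side: `load_voice_materials` returns `data: Some(vec![])` together with `code: 500`, so a caller that ignores `code` would see an empty library rather than an error.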
@@ -57,4 +57,6 @@ export interface ScriptShot {
|
||||
alignmentResult?: AlignmentResult; // 字幕打轴结果
|
||||
burnedVideoPath?: string; // 压制字幕后的视频路径
|
||||
burnedAt?: number; // 压制字幕的时间戳
|
||||
audioPath?: string; // 本地配音音频文件路径
|
||||
audioUrl?: string; // 七牛云配音音频 URL
|
||||
}
|
||||
|
||||
@@ -20,7 +20,7 @@ const navItems: NavItem[] = [
|
||||
label: '内容管理',
|
||||
icon: 'M19 11H5m14 0a2 2 0 012 2v6a2 2 0 01-2 2H5a2 2 0 01-2-2v-6a2 2 0 012-2m14 0V9a2 2 0 00-2-2M5 11V9a2 2 0 012-2m0 0V5a2 2 0 012-2h6a2 2 0 012 2v2M7 7h10',
|
||||
children: [
|
||||
{ id: 'avatar-clone', label: '形象克隆' },
|
||||
{ id: 'voice-material', label: '我的素材' },
|
||||
{ id: 'my-works', label: '我的作品' },
|
||||
],
|
||||
},
|
||||
|
||||
@@ -25,15 +25,14 @@ interface ShotStatsProps {
  * Shot statistics component
  */
 export const ShotStats: React.FC<ShotStatsProps> = ({ stats, className = '' }) => {
-  const { totalWords, totalDuration, segmentCount /* , emptyShotCount */ } = stats;
+  const { totalWords, totalDuration, segmentCount, emptyShotCount } = stats;
 
   return (
     <div className={`shot-stats ${className}`}>
       <StatItem icon={<WordIcon />} value={totalWords} label="总字数" />
       <StatItem icon={<DurationIcon />} value={`${totalDuration}s`} label="预计时长" />
       <StatItem icon={<SegmentIcon />} value={segmentCount} label="分镜数" />
-      {/* Empty-shot feature temporarily disabled */}
-      {/* <StatItem icon={<EmptyShotIcon />} value={emptyShotCount} label="空镜数" /> */}
+      <StatItem icon={<EmptyShotIcon />} value={emptyShotCount} label="空镜数" />
     </div>
   );
 };
@@ -79,13 +78,12 @@ const SegmentIcon = () => (
   </svg>
 );
 
-/* Empty-shot feature temporarily disabled */
-// const EmptyShotIcon = () => (
-//   <svg width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
-//     <rect x="3" y="3" width="18" height="18" rx="2" ry="2" />
-//     <circle cx="8.5" cy="8.5" r="1.5" />
-//     <polyline points="21 15 16 10 5 21" />
-//   </svg>
-// );
+const EmptyShotIcon = () => (
+  <svg width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
+    <rect x="3" y="3" width="18" height="18" rx="2" ry="2" />
+    <circle cx="8.5" cy="8.5" r="1.5" />
+    <polyline points="21 15 16 10 5 21" />
+  </svg>
+);
 
 export default ShotStats;
@@ -1282,3 +1282,40 @@
   padding: 0 var(--spacing-sm);
   font-size: var(--font-xs);
 }
+
+/* Empty State */
+.empty-state {
+  display: flex;
+  flex-direction: column;
+  align-items: center;
+  justify-content: center;
+  gap: var(--spacing-md);
+  background: var(--bg-card);
+  border: 2px dashed var(--border-color);
+  border-radius: var(--radius-xl);
+}
+
+.empty-state-icon {
+  width: 80px;
+  height: 80px;
+  border-radius: var(--radius-full);
+  background: var(--bg-input);
+  display: flex;
+  align-items: center;
+  justify-content: center;
+  color: var(--text-placeholder);
+}
+
+.empty-state-title {
+  font-size: var(--font-base);
+  font-weight: 600;
+  color: var(--text-secondary);
+}
+
+.empty-state-desc {
+  font-size: var(--font-sm);
+  color: var(--text-tertiary);
+  text-align: center;
+  max-width: 280px;
+  line-height: 1.6;
+}
@@ -0,0 +1,635 @@
|
||||
/**
 * Material library page
 * =====================
 *
 * Manages audio materials (voice cloning) and video materials.
 * - Audio: upload mp3/wav → Qiniu Cloud → Kling voice cloning → voices.json
 * - Video: upload mp4/mov → Qiniu Cloud → avatar.json
 */
|
||||
|
||||
import { useState, useEffect, useRef, useCallback } from 'react';
|
||||
import { useVoiceStore } from '../../store/voiceStore';
|
||||
import { toast } from '../../store/uiStore';
|
||||
import { useProgressStore } from '../../store/progressStore';
|
||||
import * as voiceApi from '../../api/modules/voice';
|
||||
import Modal from '../../components/Modal/Modal';
|
||||
import ConfirmModal from '../../components/Modal/ConfirmModal';
|
||||
import './ContentManagement.css';
|
||||
|
||||
export default function VoiceMaterialLibrary() {
|
||||
const [activeTab, setActiveTab] = useState<'audio' | 'video'>('audio');
|
||||
const [uploadModalOpen, setUploadModalOpen] = useState(false);
|
||||
const [uploadName, setUploadName] = useState('');
|
||||
const [selectedFile, setSelectedFile] = useState<File | null>(null);
|
||||
|
||||
// Rename state
|
||||
const [editingId, setEditingId] = useState<string | null>(null);
|
||||
const [editingName, setEditingName] = useState('');
|
||||
|
||||
// Delete-confirmation state
|
||||
const [deleteModalOpen, setDeleteModalOpen] = useState(false);
|
||||
const [deleteTarget, setDeleteTarget] = useState<{ id: string; name: string; type: 'audio' | 'video' } | null>(null);
|
||||
|
||||
const fileInputRef = useRef<HTMLInputElement>(null);
|
||||
const pollingIds = useRef<Set<string>>(new Set());
|
||||
|
||||
const {
|
||||
voiceMaterials,
|
||||
avatarMaterials,
|
||||
isLoadingMaterials,
|
||||
isLoadingAvatarMaterials,
|
||||
loadVoiceMaterials,
|
||||
loadAvatarMaterials,
|
||||
addVoiceMaterial,
|
||||
addAvatarMaterial,
|
||||
renameVoiceMaterial,
|
||||
renameAvatarMaterial,
|
||||
deleteVoiceMaterial,
|
||||
deleteAvatarMaterial,
|
||||
updateVoiceMaterialStatus,
|
||||
} = useVoiceStore();
|
||||
|
||||
// Load data
|
||||
useEffect(() => {
|
||||
loadVoiceMaterials();
|
||||
loadAvatarMaterials();
|
||||
}, []);
|
||||
|
||||
// Poll audio materials whose status is pending/processing
|
||||
useEffect(() => {
|
||||
const pending = voiceMaterials.filter(m => m.status === 'pending' || m.status === 'processing');
|
||||
const intervals: ReturnType<typeof setInterval>[] = [];
|
||||
|
||||
for (const item of pending) {
|
||||
if (pollingIds.current.has(item.id)) continue;
|
||||
pollingIds.current.add(item.id);
|
||||
|
||||
const interval = setInterval(async () => {
|
||||
try {
|
||||
const result = await voiceApi.queryCloneTask(item.id);
|
||||
if (result.status === 'succeeded') {
|
||||
updateVoiceMaterialStatus(item.id, 'ready', result.voiceId, result.trialUrl);
|
||||
clearInterval(interval);
|
||||
pollingIds.current.delete(item.id);
|
||||
} else if (result.status === 'failed') {
|
||||
updateVoiceMaterialStatus(item.id, 'failed');
|
||||
clearInterval(interval);
|
||||
pollingIds.current.delete(item.id);
|
||||
}
|
||||
} catch (err) {
|
||||
console.error('[VoiceMaterialLibrary] 轮询克隆状态失败:', err);
|
||||
}
|
||||
}, 5000);
|
||||
|
||||
intervals.push(interval);
|
||||
|
||||
// Stop polling automatically after 10 minutes
|
||||
setTimeout(() => {
|
||||
clearInterval(interval);
|
||||
pollingIds.current.delete(item.id);
|
||||
}, 600000);
|
||||
}
|
||||
|
||||
return () => intervals.forEach(clearInterval);
|
||||
}, [voiceMaterials, updateVoiceMaterialStatus]);
|
||||
|
||||
// Audio file validation
|
||||
const validateAudioFile = (file: File): Promise<{ valid: boolean; error?: string }> => {
|
||||
return new Promise(resolve => {
|
||||
const allowedExts = ['.mp3', '.wav'];
|
||||
const ext = file.name.substring(file.name.lastIndexOf('.')).toLowerCase();
|
||||
|
||||
if (!allowedExts.includes(ext)) {
|
||||
resolve({ valid: false, error: '仅支持 MP3、WAV 格式' });
|
||||
return;
|
||||
}
|
||||
|
||||
const audio = document.createElement('audio');
|
||||
audio.preload = 'metadata';
|
||||
|
||||
audio.onloadedmetadata = () => {
|
||||
const duration = audio.duration;
|
||||
URL.revokeObjectURL(audio.src);
|
||||
if (duration < 5) {
|
||||
resolve({ valid: false, error: `音频时长 ${duration.toFixed(1)} 秒,要求至少 5 秒` });
|
||||
return;
|
||||
}
|
||||
if (duration > 30) {
|
||||
resolve({ valid: false, error: `音频时长 ${duration.toFixed(1)} 秒,要求不超过 30 秒` });
|
||||
return;
|
||||
}
|
||||
resolve({ valid: true });
|
||||
};
|
||||
|
||||
audio.onerror = () => {
|
||||
URL.revokeObjectURL(audio.src);
|
||||
resolve({ valid: false, error: '无法读取音频文件' });
|
||||
};
|
||||
|
||||
setTimeout(() => {
|
||||
URL.revokeObjectURL(audio.src);
|
||||
resolve({ valid: false, error: '读取音频超时' });
|
||||
}, 8000);
|
||||
|
||||
audio.src = URL.createObjectURL(file);
|
||||
});
|
||||
};
|
||||
|
||||
// Video file validation
|
||||
const validateVideoFile = (file: File): Promise<{ valid: boolean; error?: string }> => {
|
||||
return new Promise(resolve => {
|
||||
const allowedExts = ['.mp4', '.mov'];
|
||||
const ext = file.name.substring(file.name.lastIndexOf('.')).toLowerCase();
|
||||
|
||||
if (!allowedExts.includes(ext)) {
|
||||
resolve({ valid: false, error: '仅支持 MP4、MOV 格式' });
|
||||
return;
|
||||
}
|
||||
|
||||
const video = document.createElement('video');
|
||||
video.preload = 'metadata';
|
||||
|
||||
video.onloadedmetadata = () => {
|
||||
const duration = video.duration;
|
||||
URL.revokeObjectURL(video.src);
|
||||
if (duration < 3) {
|
||||
resolve({ valid: false, error: `视频时长 ${duration.toFixed(1)} 秒,要求至少 3 秒` });
|
||||
return;
|
||||
}
|
||||
if (duration > 10) {
|
||||
resolve({ valid: false, error: `视频时长 ${duration.toFixed(1)} 秒,要求不超过 10 秒` });
|
||||
return;
|
||||
}
|
||||
resolve({ valid: true });
|
||||
};
|
||||
|
||||
video.onerror = () => {
|
||||
URL.revokeObjectURL(video.src);
|
||||
resolve({ valid: false, error: '无法读取视频文件' });
|
||||
};
|
||||
|
||||
setTimeout(() => {
|
||||
URL.revokeObjectURL(video.src);
|
||||
resolve({ valid: false, error: '读取视频超时' });
|
||||
}, 8000);
|
||||
|
||||
video.src = URL.createObjectURL(file);
|
||||
});
|
||||
};
|
||||
|
||||
// File selection
|
||||
const handleFileSelect = useCallback(async (e: React.ChangeEvent<HTMLInputElement>) => {
|
||||
const file = e.target.files?.[0];
|
||||
if (!file) return;
|
||||
|
||||
const validation = activeTab === 'audio'
|
||||
? await validateAudioFile(file)
|
||||
: await validateVideoFile(file);
|
||||
|
||||
if (!validation.valid) {
|
||||
toast.error(validation.error || '文件验证失败');
|
||||
e.target.value = '';
|
||||
return;
|
||||
}
|
||||
|
||||
setSelectedFile(file);
|
||||
}, [activeTab]);
|
||||
|
||||
// Upload handling
|
||||
const handleUpload = useCallback(async () => {
|
||||
if (!uploadName.trim() || !selectedFile) return;
|
||||
|
||||
const progress = useProgressStore.getState();
|
||||
setUploadModalOpen(false);
|
||||
|
||||
if (activeTab === 'audio') {
|
||||
progress.show('上传素材');
|
||||
try {
|
||||
progress.update('文件校验中...');
|
||||
await addVoiceMaterial(selectedFile, uploadName.trim());
|
||||
progress.update('正在生成专属音色...');
|
||||
progress.success('提交成功');
|
||||
} catch (err) {
|
||||
progress.error(err instanceof Error ? err.message : '上传失败');
|
||||
}
|
||||
} else {
|
||||
progress.show('上传素材');
|
||||
try {
|
||||
progress.update('文件校验中...');
|
||||
await addAvatarMaterial(selectedFile, uploadName.trim());
|
||||
progress.success('上传成功');
|
||||
} catch (err) {
|
||||
progress.error(err instanceof Error ? err.message : '上传失败');
|
||||
}
|
||||
}
|
||||
|
||||
setUploadName('');
|
||||
setSelectedFile(null);
|
||||
}, [activeTab, uploadName, selectedFile, addVoiceMaterial, addAvatarMaterial]);
|
||||
|
||||
// Delete handling
|
||||
const openDeleteModal = (id: string, name: string, type: 'audio' | 'video') => {
|
||||
setDeleteTarget({ id, name, type });
|
||||
setDeleteModalOpen(true);
|
||||
};
|
||||
|
||||
const handleConfirmDelete = useCallback(async () => {
|
||||
if (!deleteTarget) return;
|
||||
try {
|
||||
if (deleteTarget.type === 'audio') {
|
||||
await deleteVoiceMaterial(deleteTarget.id);
|
||||
} else {
|
||||
await deleteAvatarMaterial(deleteTarget.id);
|
||||
}
|
||||
toast.success('已删除');
|
||||
} catch {
|
||||
toast.error('删除失败');
|
||||
} finally {
|
||||
setDeleteModalOpen(false);
|
||||
setDeleteTarget(null);
|
||||
}
|
||||
}, [deleteTarget, deleteVoiceMaterial, deleteAvatarMaterial]);
|
||||
|
||||
const statusLabel = (status: string) => {
|
||||
switch (status) {
|
||||
case 'ready': return '可用';
|
||||
case 'pending': return '等待中';
|
||||
case 'processing': return '克隆中...';
|
||||
case 'failed': return '失败';
|
||||
default: return status;
|
||||
}
|
||||
};
|
||||
|
||||
const statusColor = (status: string) => {
|
||||
switch (status) {
|
||||
case 'ready': return '#22c55e';
|
||||
case 'pending': return 'var(--text-secondary)';
|
||||
case 'processing': return '#f59e0b';
|
||||
case 'failed': return '#ef4444';
|
||||
default: return 'var(--text-secondary)';
|
||||
}
|
||||
};
|
||||
|
||||
const startRename = (id: string, currentName: string) => {
|
||||
setEditingId(id);
|
||||
setEditingName(currentName);
|
||||
};
|
||||
|
||||
const cancelRename = () => {
|
||||
setEditingId(null);
|
||||
setEditingName('');
|
||||
};
|
||||
|
||||
const confirmRename = useCallback(async () => {
|
||||
if (!editingId || !editingName.trim()) {
|
||||
cancelRename();
|
||||
return;
|
||||
}
|
||||
try {
|
||||
if (activeTab === 'audio') {
|
||||
await renameVoiceMaterial(editingId, editingName.trim());
|
||||
} else {
|
||||
await renameAvatarMaterial(editingId, editingName.trim());
|
||||
}
|
||||
setEditingId(null);
|
||||
setEditingName('');
|
||||
} catch {
|
||||
toast.error('重命名失败');
|
||||
}
|
||||
}, [editingId, editingName, activeTab, renameVoiceMaterial, renameAvatarMaterial]);
|
||||
|
||||
return (
|
||||
<div className="content-page">
|
||||
<div className="content-header">
|
||||
<h2>我的素材</h2>
|
||||
</div>
|
||||
|
||||
{/* Tabs + upload button */}
|
||||
<div
|
||||
style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', gap: 16, marginBottom: 'var(--spacing-md)' }}
|
||||
>
|
||||
<div style={{ display: 'flex', gap: 0, borderBottom: 0 }}>
|
||||
<button
|
||||
className={`voice-tab ${activeTab === 'audio' ? 'active' : ''}`}
|
||||
onClick={() => { setActiveTab('audio'); setSelectedFile(null); setUploadName(''); }}
|
||||
>
|
||||
音频素材 ({voiceMaterials.length})
|
||||
</button>
|
||||
<button
|
||||
className={`voice-tab ${activeTab === 'video' ? 'active' : ''}`}
|
||||
onClick={() => { setActiveTab('video'); setSelectedFile(null); setUploadName(''); }}
|
||||
>
|
||||
视频素材 ({avatarMaterials.length})
|
||||
</button>
|
||||
</div>
|
||||
<button
|
||||
className="btn btn-primary"
|
||||
onClick={() => setUploadModalOpen(true)}
|
||||
style={{
|
||||
display: 'inline-flex',
|
||||
alignItems: 'center',
|
||||
gap: 6,
|
||||
borderRadius: 6,
|
||||
flexShrink: 0,
|
||||
}}
|
||||
>
|
||||
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
|
||||
<path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4" />
|
||||
<polyline points="17 8 12 3 7 8" />
|
||||
<line x1="12" y1="3" x2="12" y2="15" />
|
||||
</svg>
|
||||
上传{activeTab === 'audio' ? '音频' : '视频'}素材
|
||||
</button>
|
||||
</div>
|
||||
|
||||
{/* Upload modal */}
|
||||
<Modal
|
||||
open={uploadModalOpen}
|
||||
onClose={() => setUploadModalOpen(false)}
|
||||
title={`上传${activeTab === 'audio' ? '音频' : '视频'}素材`}
|
||||
width="480px"
|
||||
>
|
||||
<div style={{ display: 'flex', flexDirection: 'column', gap: 16 }}>
|
||||
<div>
|
||||
<label style={{ fontSize: 'var(--font-sm)', fontWeight: 500, marginBottom: 8, display: 'block' }}>
|
||||
素材名称
|
||||
</label>
|
||||
<input
|
||||
type="text"
|
||||
className="input"
|
||||
placeholder={`例如:我的${activeTab === 'audio' ? '声音' : '形象'}`}
|
||||
value={uploadName}
|
||||
onChange={e => setUploadName(e.target.value)}
|
||||
style={{ width: '100%' }}
|
||||
/>
|
||||
</div>
|
||||
|
||||
<div>
|
||||
<label style={{ fontSize: 'var(--font-sm)', fontWeight: 500, marginBottom: 8, display: 'block' }}>
|
||||
选择文件
|
||||
</label>
|
||||
<div
|
||||
style={{
|
||||
border: '2px dashed var(--border-color)',
|
||||
borderRadius: 'var(--radius-md)',
|
||||
padding: 'var(--spacing-xl)',
|
||||
textAlign: 'center',
|
||||
cursor: 'pointer',
|
||||
transition: 'all var(--transition-fast)',
|
||||
}}
|
||||
onClick={() => fileInputRef.current?.click()}
|
||||
onMouseEnter={e => { e.currentTarget.style.borderColor = 'var(--primary)'; }}
|
||||
onMouseLeave={e => { e.currentTarget.style.borderColor = 'var(--border-color)'; }}
|
||||
>
|
||||
<input
|
||||
ref={fileInputRef}
|
||||
type="file"
|
||||
accept={activeTab === 'audio' ? '.mp3,.wav' : '.mp4,.mov'}
|
||||
onChange={handleFileSelect}
|
||||
style={{ display: 'none' }}
|
||||
/>
|
||||
{selectedFile ? (
|
||||
<div>
|
||||
<div style={{ fontWeight: 500, fontSize: 'var(--font-sm)' }}>{selectedFile.name}</div>
|
||||
<div style={{ fontSize: 'var(--font-xs)', color: 'var(--text-secondary)', marginTop: 4 }}>
|
||||
{(selectedFile.size / 1024 / 1024).toFixed(2)} MB
|
||||
</div>
|
||||
</div>
|
||||
) : (
|
||||
<div style={{ color: 'var(--text-secondary)' }}>
|
||||
<div style={{ fontSize: 'var(--font-sm)' }}>点击选择文件</div>
|
||||
<div style={{ fontSize: 'var(--font-xs)', marginTop: 6, lineHeight: 1.6 }}>
|
||||
{activeTab === 'audio'
|
||||
? '支持 MP3 / WAV,人声干净无杂音,时长 5-30 秒'
|
||||
: '支持 MP4 / MOV,人物正面视频,时长 3-10 秒'}
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div style={{ display: 'flex', gap: 12, justifyContent: 'flex-end' }}>
|
||||
<button className="btn btn-secondary" onClick={() => setUploadModalOpen(false)}>取消</button>
|
||||
<button
|
||||
className="btn btn-primary"
|
||||
onClick={handleUpload}
|
||||
disabled={!uploadName.trim() || !selectedFile}
|
||||
>
|
||||
确认上传
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</Modal>
|
||||
|
||||
{/* Audio list */}
|
||||
{activeTab === 'audio' && (
|
||||
isLoadingMaterials ? (
|
||||
<p style={{ color: 'var(--text-secondary)' }}>加载中...</p>
|
||||
) : (
|
||||
<div className="voice-list" style={{ display: 'flex', flexDirection: 'column', gap: 12, flex: 1 }}>
|
||||
{voiceMaterials.length === 0 && (
|
||||
<div className="empty-state" style={{ minHeight: 300, flex: 1 }}>
|
||||
<div className="empty-state-icon">
|
||||
<svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="1" strokeLinecap="round" strokeLinejoin="round">
|
||||
<path d="M12 1a3 3 0 0 0-3 3v8a3 3 0 0 0 6 0V4a3 3 0 0 0-3-3z" />
|
||||
<path d="M19 10v2a7 7 0 0 1-14 0v-2" />
|
||||
<line x1="12" y1="19" x2="12" y2="23" />
|
||||
<line x1="8" y1="23" x2="16" y2="23" />
|
||||
</svg>
|
||||
</div>
|
||||
<p className="empty-state-title">暂无音频素材</p>
|
||||
<p className="empty-state-desc">点击右上角按钮上传音频素材,<br />上传后将自动进行音色克隆</p>
|
||||
</div>
|
||||
)}
|
||||
{voiceMaterials.map(m => (
|
||||
<div
|
||||
key={m.id}
|
||||
className="voice-row"
|
||||
style={{
|
||||
padding: 'var(--spacing-sm) var(--spacing-md)',
|
||||
borderRadius: 'var(--radius-md)',
|
||||
border: '1px solid var(--border-color)',
|
||||
background: 'var(--bg-card)',
|
||||
}}
|
||||
>
|
||||
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', gap: 12 }}>
|
||||
<div style={{ flex: 1, minWidth: 0 }}>
|
||||
{editingId === m.id ? (
|
||||
<input
|
||||
type="text"
|
||||
className="input"
|
||||
value={editingName}
|
||||
onChange={e => setEditingName(e.target.value)}
|
||||
onKeyDown={e => {
|
||||
if (e.key === 'Enter') confirmRename();
|
||||
if (e.key === 'Escape') cancelRename();
|
||||
}}
|
||||
onBlur={confirmRename}
|
||||
autoFocus
|
||||
style={{ width: '100%', height: 28, padding: '2px 8px', fontSize: 'var(--font-sm)' }}
|
||||
/>
|
||||
) : (
|
||||
<div style={{ fontWeight: 500, fontSize: 'var(--font-sm)', overflow: 'hidden', textOverflow: 'ellipsis', whiteSpace: 'nowrap' }}>
|
||||
{m.name}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
<div style={{ display: 'flex', alignItems: 'center', gap: 8, flexShrink: 0 }}>
|
||||
<span
|
||||
style={{
|
||||
fontSize: 'var(--font-xs)',
|
||||
color: statusColor(m.status),
|
||||
background: `${statusColor(m.status)}15`,
|
||||
padding: '2px 8px',
|
||||
borderRadius: 'var(--radius-sm)',
|
||||
whiteSpace: 'nowrap',
|
||||
}}
|
||||
>
|
||||
{statusLabel(m.status)}
|
||||
</span>
|
||||
<button
|
||||
className="btn btn-ghost"
|
||||
style={{ padding: '4px', width: 28, height: 28 }}
|
||||
onClick={() => startRename(m.id, m.name)}
|
||||
title="重命名"
|
||||
>
|
||||
<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
|
||||
<path d="M11 4H4a2 2 0 0 0-2 2v14a2 2 0 0 0 2 2h14a2 2 0 0 0 2-2v-7" />
|
||||
<path d="M18.5 2.5a2.121 2.121 0 0 1 3 3L12 15l-4 1 1-4 9.5-9.5z" />
|
||||
</svg>
|
||||
</button>
|
||||
<button
|
||||
className="btn btn-ghost"
|
||||
style={{ padding: '4px', width: 28, height: 28, color: 'var(--text-tertiary)' }}
|
||||
onClick={() => openDeleteModal(m.id, m.name, 'audio')}
|
||||
title="删除"
|
||||
>
|
||||
<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
|
||||
<polyline points="3 6 5 6 21 6" />
|
||||
<path d="M19 6v14a2 2 0 0 1-2 2H7a2 2 0 0 1-2-2V6m3 0V4a2 2 0 0 1 2-2h4a2 2 0 0 1 2 2v2" />
|
||||
</svg>
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
{m.status === 'ready' && m.trialUrl && (
|
||||
<audio src={m.trialUrl} controls style={{ width: '100%', marginTop: 6, height: 28 }} />
|
||||
)}
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
)
|
||||
)}
|
||||
|
||||
{/* Video list */}
|
||||
{activeTab === 'video' && (
|
||||
isLoadingAvatarMaterials ? (
|
||||
<p style={{ color: 'var(--text-secondary)' }}>加载中...</p>
|
||||
) : (
|
||||
<div style={{ display: 'grid', gridTemplateColumns: 'repeat(auto-fill, minmax(200px, 1fr))', gap: 16, flex: 1 }}>
|
||||
{avatarMaterials.length === 0 && (
|
||||
<div className="empty-state" style={{ gridColumn: '1 / -1', minHeight: 300, flex: 1 }}>
|
||||
<div className="empty-state-icon">
|
||||
<svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="1" strokeLinecap="round" strokeLinejoin="round">
|
||||
<rect x="2" y="2" width="20" height="20" rx="2.18" ry="2.18" />
|
||||
<line x1="7" y1="2" x2="7" y2="22" />
|
||||
<line x1="17" y1="2" x2="17" y2="22" />
|
||||
<line x1="2" y1="12" x2="22" y2="12" />
|
||||
<line x1="2" y1="7" x2="7" y2="7" />
|
||||
<line x1="2" y1="17" x2="7" y2="17" />
|
||||
<line x1="17" y1="17" x2="22" y2="17" />
|
||||
<line x1="17" y1="7" x2="22" y2="7" />
|
||||
</svg>
|
||||
</div>
|
||||
<p className="empty-state-title">暂无视频素材</p>
|
||||
<p className="empty-state-desc">点击右上角按钮上传视频素材,<br />用于数字人形象制作</p>
|
||||
</div>
|
||||
)}
|
||||
{avatarMaterials.map(m => (
|
||||
<div
|
||||
key={m.id}
|
||||
style={{
|
||||
padding: 'var(--spacing-sm)',
|
||||
borderRadius: 'var(--radius-md)',
|
||||
border: '1px solid var(--border-color)',
|
||||
background: 'var(--bg-card)',
|
||||
}}
|
||||
>
|
||||
<video
|
||||
src={m.videoUrl}
|
||||
controls
|
||||
style={{
|
||||
width: '100%',
|
||||
borderRadius: 'var(--radius-sm)',
|
||||
aspectRatio: '9/16',
|
||||
objectFit: 'cover',
|
||||
background: '#000',
|
||||
}}
|
||||
/>
|
||||
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginTop: 6, gap: 8 }}>
|
||||
<div style={{ flex: 1, minWidth: 0 }}>
|
||||
{editingId === m.id ? (
|
||||
<input
|
||||
type="text"
|
||||
className="input"
|
||||
value={editingName}
|
||||
onChange={e => setEditingName(e.target.value)}
|
||||
onKeyDown={e => {
|
||||
if (e.key === 'Enter') confirmRename();
|
||||
if (e.key === 'Escape') cancelRename();
|
||||
}}
|
||||
onBlur={confirmRename}
|
||||
autoFocus
|
||||
style={{ width: '100%', height: 26, padding: '2px 8px', fontSize: 'var(--font-sm)' }}
|
||||
/>
|
||||
) : (
|
||||
<span style={{ fontWeight: 500, fontSize: 'var(--font-sm)', overflow: 'hidden', textOverflow: 'ellipsis', whiteSpace: 'nowrap', display: 'block' }}>
|
||||
{m.name}
|
||||
</span>
|
||||
)}
|
||||
</div>
|
||||
<div style={{ display: 'flex', alignItems: 'center', gap: 4, flexShrink: 0 }}>
|
||||
<button
|
||||
className="btn btn-ghost"
|
||||
style={{ padding: '4px', width: 26, height: 26 }}
|
||||
onClick={() => startRename(m.id, m.name)}
|
||||
title="重命名"
|
||||
>
|
||||
<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
|
||||
<path d="M11 4H4a2 2 0 0 0-2 2v14a2 2 0 0 0 2 2h14a2 2 0 0 0 2-2v-7" />
|
||||
<path d="M18.5 2.5a2.121 2.121 0 0 1 3 3L12 15l-4 1 1-4 9.5-9.5z" />
|
||||
</svg>
|
||||
</button>
|
||||
<button
|
||||
className="btn btn-ghost"
|
||||
style={{ padding: '4px', width: 26, height: 26, color: 'var(--text-tertiary)' }}
|
||||
onClick={() => openDeleteModal(m.id, m.name, 'video')}
|
||||
title="删除"
|
||||
>
|
||||
<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
|
||||
<polyline points="3 6 5 6 21 6" />
|
||||
<path d="M19 6v14a2 2 0 0 1-2 2H7a2 2 0 0 1-2-2V6m3 0V4a2 2 0 0 1 2-2h4a2 2 0 0 1 2 2v2" />
|
||||
</svg>
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
)
|
||||
)}
|
||||
|
||||
{/* Delete confirmation modal */}
|
||||
<ConfirmModal
|
||||
open={deleteModalOpen}
|
||||
type="danger"
|
||||
title={<>确认删除素材 <strong>「{deleteTarget?.name}」</strong> 吗?</>}
|
||||
description="此操作不可撤销,素材将被永久删除"
|
||||
confirmText="确认删除"
|
||||
cancelText="取消"
|
||||
confirmButtonType="danger"
|
||||
onConfirm={handleConfirmDelete}
|
||||
onCancel={() => { setDeleteModalOpen(false); setDeleteTarget(null); }}
|
||||
/>
|
||||
</div>
|
||||
);
|
||||
}
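
The two validators in this file share one shape: check the extension against an allow-list, then check the decoded duration against a range. A minimal sketch of that rule as a pure function, assuming the duration has already been read from the media element (`MediaRule`, `AUDIO_RULE`, `VIDEO_RULE` and `checkMedia` are hypothetical names; the limits are copied from `validateAudioFile` and `validateVideoFile` above, and the English error strings are placeholders):

```typescript
interface MediaRule {
  extensions: string[]; // allowed lowercase extensions, including the dot
  minSeconds: number;
  maxSeconds: number;
}

interface ValidationResult {
  valid: boolean;
  error?: string;
}

// Limits taken from validateAudioFile / validateVideoFile in the diff.
const AUDIO_RULE: MediaRule = { extensions: ['.mp3', '.wav'], minSeconds: 5, maxSeconds: 30 };
const VIDEO_RULE: MediaRule = { extensions: ['.mp4', '.mov'], minSeconds: 3, maxSeconds: 10 };

// Pure check: no DOM access, so it can be unit-tested without a browser.
function checkMedia(fileName: string, durationSeconds: number, rule: MediaRule): ValidationResult {
  const dot = fileName.lastIndexOf('.');
  const ext = dot >= 0 ? fileName.slice(dot).toLowerCase() : '';
  if (rule.extensions.indexOf(ext) === -1) {
    return { valid: false, error: `unsupported format: ${ext || '(none)'}` };
  }
  if (!isFinite(durationSeconds)) {
    return { valid: false, error: 'duration unavailable' };
  }
  if (durationSeconds < rule.minSeconds) {
    return { valid: false, error: `duration ${durationSeconds.toFixed(1)}s is below ${rule.minSeconds}s` };
  }
  if (durationSeconds > rule.maxSeconds) {
    return { valid: false, error: `duration ${durationSeconds.toFixed(1)}s is above ${rule.maxSeconds}s` };
  }
  return { valid: true };
}
```

With the rule separated out, the `onloadedmetadata` handlers would only need to read `duration` and pass it on, and the audio and video paths could share one metadata-loading wrapper.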
|
||||
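The polling effect in this file keeps in-flight ids in a ref-held `Set` and clears the interval timers in its cleanup. A minimal sketch of the same bookkeeping as a standalone registry, written under the assumption that an id should be released whenever its timer is cleared so that a later effect run can start polling it again (`PollRegistry` is a hypothetical name, not part of the commit):

```typescript
type TimerHandle = ReturnType<typeof setInterval>;

// Tracks one interval timer per id and keeps the id set and the timers in sync.
class PollRegistry {
  private timers: { [id: string]: TimerHandle } = {};

  has(id: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.timers, id);
  }

  // Starts polling for `id`; returns false if it is already being polled.
  start(id: string, tick: () => void, intervalMs: number): boolean {
    if (this.has(id)) return false;
    this.timers[id] = setInterval(tick, intervalMs);
    return true;
  }

  // Clears the timer and forgets the id in one step.
  stop(id: string): void {
    if (!this.has(id)) return;
    clearInterval(this.timers[id]);
    delete this.timers[id];
  }

  // Intended for effect cleanup: nothing stays marked as in-flight afterwards.
  stopAll(): void {
    Object.keys(this.timers).forEach(id => this.stop(id));
  }

  count(): number {
    return Object.keys(this.timers).length;
  }
}
```

In the component, the effect cleanup would call `stopAll()`, and the succeeded, failed and timeout branches would each call `stop(item.id)`. Because an id is only considered in flight while its timer exists, a re-run of the effect after a status change can resume polling the remaining pending items.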
@@ -438,63 +438,14 @@ export default function ScriptCreation() {
|
||||
|
||||
{expandedSegments.has(seg.id) && (
|
||||
<div className="segment-body">
|
||||
{/* Scene description (shot: scene, empty shot: prompt) */}
|
||||
{/* Scene description (shot: scene, empty shot: prompt), read-only */}
|
||||
<div className="segment-field">
|
||||
<div className="segment-field-header">
|
||||
<span className="segment-field-label">画面描述</span>
|
||||
<div className="segment-field-actions">
|
||||
<button
|
||||
className="btn btn-ghost btn-xs"
|
||||
disabled={!!polishingState}
|
||||
onClick={() =>
|
||||
handlePolish(
|
||||
seg.id,
|
||||
typeof seg.scene === 'string' ? seg.scene : '',
|
||||
'scene'
|
||||
)
|
||||
}
|
||||
>
|
||||
{polishingState?.id === seg.id &&
|
||||
(polishingState?.type === 'scene' ||
|
||||
polishingState?.type === 'prompt') ? (
|
||||
<svg
|
||||
width="12"
|
||||
height="12"
|
||||
viewBox="0 0 24 24"
|
||||
fill="none"
|
||||
stroke="currentColor"
|
||||
strokeWidth="2"
|
||||
style={{ animation: 'spin 1s linear infinite' }}
|
||||
>
|
||||
<path d="M21 12a9 9 0 11-6.219-8.56" />
|
||||
</svg>
|
||||
) : null}
|
||||
润色
|
||||
</button>
|
||||
<button
|
||||
className="btn btn-ghost btn-xs"
|
||||
onClick={() => toggleEditField(seg.id, 'scene')}
|
||||
>
|
||||
{editingFields.has(`${seg.id}-scene`) ? '完成' : '编辑'}
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
{editingFields.has(`${seg.id}-scene`) ? (
|
||||
<textarea
|
||||
value={typeof seg.scene === 'string' ? seg.scene : ''}
|
||||
onChange={e => handleFieldChange(seg.id, 'scene', e.target.value)}
|
||||
rows={3}
|
||||
autoFocus
|
||||
placeholder="输入画面描述..."
|
||||
/>
|
||||
) : (
|
||||
<p
|
||||
className="segment-field-value"
|
||||
onClick={() => toggleEditField(seg.id, 'scene')}
|
||||
>
|
||||
{typeof seg.scene === 'string' ? seg.scene : '未设置画面描述'}
|
||||
</p>
|
||||
)}
|
||||
<p className="segment-field-value">
|
||||
{typeof seg.scene === 'string' ? seg.scene : '未设置画面描述'}
|
||||
</p>
|
||||
</div>
|
||||
|
||||
{/* Voice-over copy / narration (present for both types) */}
|
||||
|
||||
@@ -809,7 +809,7 @@ export default function VideoGeneration() {
   style={{ marginTop: '8px' }}
   onClick={() => {
     setShowModal(false);
-    navigate('avatar-clone');
+    navigate('voice-material');
   }}
 >
   前往形象素材库进行克隆
@@ -891,7 +891,7 @@ export default function VideoGeneration() {
   className="btn btn-ghost btn-sm"
   onClick={() => {
     setShowModal(false);
-    navigate('avatar-clone');
+    navigate('voice-material');
   }}
 >
   <svg
@@ -1,5 +1,5 @@
 import ScriptCreation from './ScriptCreation';
-import AudioMixing from './AudioMixing';
+import VoiceDubbing from './VoiceDubbing';
 import VideoGeneration from './VideoGeneration';
 import SubtitleBurning from './SubtitleBurning';
 import CoverDesign from './CoverDesign';
@@ -164,7 +164,7 @@ function VideoCreationContent() {
     case 1:
       return <ScriptCreation />;
     case 2:
-      return <AudioMixing />;
+      return <VoiceDubbing />;
     case 3:
       return <VideoGeneration />;
     case 4: