refactor: fork 智影 and restructure as 智剪; standalone Docker infrastructure; dev-mode auth fallback

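Main changes:
- Fix 404 on the /tasks/script route (remove the duplicated prefix)
- Add an automatic auth fallback in development mode (flows can be tested without logging in)
- Make the Docker infrastructure standalone (db/redis shared)
- Change the frontend API port to 8081
- Add 智剪 features: TTS/voice cloning, rough video cutting, audio mixing, and more
- Remove 智影-specific modules (avatar, model_usage, qiniu upload, etc.)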
小鱼开发
2026-04-21 12:35:50 +08:00
parent d05b17b61a
commit bb08d0f586
85 changed files with 13971 additions and 1882 deletions
Binary file not shown.
+115 -77
@@ -1,48 +1,52 @@
<!-- From: /Users/0fun/work/ai-meijiaka/AGENTS.md -->
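# 美家卡智影 (Meijiaka AI Video) - AI video creation platform
# 美家卡智剪 (Meijiaka Smart Cut) - AI video editing desktop app
## Project Overview
美家卡智影 is an AI-driven desktop application for video creation, built on a hybrid **Tauri + React + FastAPI** architecture. Users generate scripts with AI, create digital-human videos, and compose them into finished marketing videos.
美家卡智剪 is an AI-driven **short-video editing** desktop application, built on a hybrid **Tauri + React + FastAPI** architecture. The project is a fork of 美家卡智影 and moves from "digital-human generation" toward "AI-assisted long-video editing": the user imports long-form footage, AI cuts it into shots according to the script, voice cloning/TTS produces the voice-over, and the result is composed into a finished short video with subtitles.
### Core Features
### Core Workflow (6 steps)
- **AI script generation**: generates video scripts and storyboards automatically with an LLM
- **Digital-human video**: creates digital-human video clips with KlingAI
- **Subtitle generation**: generates subtitles automatically with Volcengine Doubao Speech and burns them into the video
- **Cover creation**: extracts the first video frame and overlays subtitle styling to produce a cover
- **Video composition**: local FFmpeg handles clip concatenation, audio muxing, and final export
- **Avatar cloning**: manages custom digital-human avatars based on KlingAI
- **Project management**: project data is stored in local JSON files; auth state is synced to the cloud
1. **Script generation** (Step 1) - generate the voice-over copy with AI or paste it in; it is split automatically into shots with estimated durations
2. **Rough cut** (Step 2) - import long-form footage and cut it into clips automatically according to the script durations
3. **Voice-over** (Step 3) - AI voice cloning (KlingAI) + TTS synthesis produce the voice-over audio for each segment
4. **Subtitle burn-in** (Step 4) - align the timeline automatically from the audio, render ASS subtitles, and burn them into the video
5. **Cover creation** (Step 5) - extract the first video frame and overlay title styling to produce a cover
6. **Video composition** (Step 6) - FFmpeg concatenates the clips, replaces the original sound with the TTS audio, and exports the final video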
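Steps 1 and 2 hinge on per-shot estimated durations. The sketch below shows one way such estimates and the resulting cut points could be computed. The speaking rate of about 230 characters per minute and the 3 to 8 second per-shot bounds are taken from the prompt templates added in this commit; the names `Segment`, `estimate_duration`, `split_script`, and `cut_ranges` are hypothetical and do not come from the repository.

```python
from dataclasses import dataclass

CHARS_PER_MINUTE = 230               # speaking rate named in this commit's prompt templates
MIN_SECONDS, MAX_SECONDS = 3.0, 8.0  # per-shot bounds from the same templates


@dataclass
class Segment:
    voiceover: str
    duration: float  # estimated seconds


def estimate_duration(voiceover: str) -> float:
    """Estimate speaking time from the character count, clamped to the allowed range."""
    seconds = len(voiceover) / CHARS_PER_MINUTE * 60
    return round(min(max(seconds, MIN_SECONDS), MAX_SECONDS), 2)


def split_script(lines: list[str]) -> list[Segment]:
    """Turn non-empty script lines into segments with estimated durations."""
    return [Segment(text, estimate_duration(text)) for text in lines if text.strip()]


def cut_ranges(segments: list[Segment]) -> list[tuple[float, float]]:
    """Consecutive (start, end) offsets into the source footage, one per segment."""
    ranges = []
    cursor = 0.0
    for seg in segments:
        end = round(cursor + seg.duration, 2)
        ranges.append((cursor, end))
        cursor = end
    return ranges
```

Each `(start, end)` pair could then be handed to a cutting command; how the Rust layer actually selects cut points is not shown in this commit, so treat the sequential layout here as an assumption.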
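### Technical Architecture
- **Desktop shell (frontend)**: Tauri 2.x + React 19 + TypeScript 5.8 + Vite 7
- **Backend service**: FastAPI (Python 3.13+) + PostgreSQL 15 + Redis 7
- **Task scheduling**: custom Async Engine (Redis-based slot management), **not Celery**
- **Local processing**: embedded FFmpeg, with video/audio commands wrapped in the Rust layer
- **AI services**: Volcengine Ark (LLM/subtitles), KlingAI (TTS/voice cloning), OpenAI-compatible endpoints
## Project Structure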
```
ai-meijiaka/
meijiaka-zj/
├── python-api/ # FastAPI 后端服务(AI 代理 + 认证 + 任务调度)
│ ├── app/
│ │ ├── api/v1/ # API 路由 (REST): auth, script, ai_models, klingai,
│ │ │ # qiniu, video, avatar, system,
│ │ │ # caption, tasks
│ │ ├── api/v1/ # API 路由: auth, system, caption, voice
│ │ ├── ai/ # AI 模型路由、Provider、提示词模板
│ │ ├── core/ # 安全、配置加载、Token管理器、Redis客户端、异常处理
│ │ ├── crud/ # data access layer (users, model_usage, avatar)
│ │ ├── crud/ # data access layer (users, avatar)
│ │ ├── db/ # database setup (PostgreSQL + asyncpg + SQLAlchemy 2.0)
│ │ ├── models/ # SQLAlchemy models (users, model_usage_logs, avatars)
│ │ ├── models/ # SQLAlchemy models (currently only users)
│ │ ├── schemas/ # Pydantic 校验模型
│ │ ├── services/ # AI service proxies, DTO normalization, Qiniu/subtitle/video services
│ │ ├── scheduler/ # Async Engine task scheduling (video, image, script,
│ │ │ # subtitle, copy, avatar_clone)
│ │ ├── services/ # AI service proxies, subtitle/video/TTS/voice-cloning services
│ │ ├── scheduler/ # Async Engine task scheduling
│ │ ├── config.py # Pydantic Settings 配置管理
│ │ └── main.py # FastAPI 入口(含生命周期管理)
│ ├── config/ # AI 模型配置文件(ai_models.yaml),支持热重载
│ ├── alembic/ # 数据库迁移
│ ├── scripts/ # 初始化/测试脚本
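│ ├── alembic/ # database migrations (currently no migration files)
│ ├── tests/ # tests (currently only test_kling_tts.py)
│ ├── pyproject.toml # Python dependencies and tool configuration
│ ├── requirements.lock # dependency versions locked by uv
│ ├── requirements.lock # dependency versions locked by uv (run make update-lock if missing)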
│ ├── Makefile # 常用命令封装
│ ├── docker-compose.yml
│ └── Dockerfile
│ ├── Dockerfile
│ └── .env.example # 环境变量模板
├── tauri-app/ # Tauri 桌面应用(业务数据本地存储)
│ ├── src/ # React 前端源码
@@ -55,6 +59,11 @@ ai-meijiaka/
│ │ │ └── ipc.ts # Tauri IPC 调用封装
│ │ ├── components/ # 可复用组件
│ │ ├── pages/ # 页面组件
│ │ │ ├── VideoCreation/ # 6 步视频创作流程
│ │ │ ├── ContentManagement/
│ │ │ ├── Settings/
│ │ │ ├── Profile/
│ │ │ └── Login/
│ │ ├── store/ # Zustand state management (+ Immer + persist)
│ │ ├── hooks/ # 自定义 React Hooks
│ │ ├── styles/ # 全局 CSS 变量、主题
@@ -65,10 +74,8 @@ ai-meijiaka/
│ │ │ ├── ffmpeg_cmd.rs # FFmpeg 命令封装
│ │ │ ├── video_processing.rs # 视频合成业务逻辑
│ │ │ ├── storage/ # 本地存储引擎(原子写入、文件锁、路径净化)
│ │ │ ├── commands/ # IPC commands split by domain (project/asset/auth/avatar)
│ │ │ ├── commands/ # IPC commands split by domain
│ │ │ ├── api_proxy.rs # Python API proxy forwarding
│ │ │ ├── auth.rs # auth commands (migrated to commands/auth_state.rs)
│ │ │ ├── avatar_cache.rs # 头像缓存管理
│ │ │ └── utils.rs # 通用工具函数
│ │ ├── Cargo.toml
│ │ ├── tauri.conf.json
@@ -78,19 +85,13 @@ ai-meijiaka/
│ ├── tsconfig.json
│ └── eslint.config.js
├── docs/ # project docs
│ ├── anytocopy-api.md
│ ├── anytocopy-integration.md
│ ├── app-update-system.md
│ ├── database-design.md
│ ├── kling-api-dev.md
│ ├── migrate-avatars-to-local.md
│ ├── qiniu-kodo-python-sdk-guide.md
│ ├── video-generation-flow.md
│ └── volcengine-video-caption-api.md
├── docs/ # project docs
├── scripts/ # data repair/migration scripts (not deployment scripts)
├── package.json # root-level package.json (monorepo placeholder)
└── AGENTS.md # 本文件
```
## 技术栈
## 技术栈详解
### 后端 (python-api)
@@ -100,20 +101,22 @@ ai-meijiaka/
|------|------|------|------|
| Python | - | 3.13+ | 运行环境 |
| Web 框架 | FastAPI | 0.116+ | REST API |
| 数据库 | PostgreSQL | 15+ | 用户认证 + 成本统计 + 形象管理 |
| 数据库 | PostgreSQL | 15+ | 用户认证 |
| ORM | SQLAlchemy | 2.0 (异步) | 数据模型 |
| 缓存/调度 | Redis + Async Engine | 5.2+ / 自定义 | 异步任务槽位调度 |
| AI SDK | OpenAI / volcengine | 1.58+ / 5.0+ | LLM 调用 |
| AI SDK | OpenAI / volcengine | 1.58+ / 5.0+ | LLM / 字幕调用 |
| TTS/语音 | KlingAI API | - | 语音合成、声音克隆 |
| 认证 | python-jose + passlib | 3.4+ / 1.7+ | JWT 认证 |
| 对象存储 | qiniu | 7.13+ | 七牛云存储 |
| HTTP 客户端 | httpx + aiohttp | 0.28+ / 3.13+ | 异步 HTTP |
| 包管理/构建 | uv | - | 虚拟环境、依赖锁定、Docker 构建 |
**后端架构说明**
- 后端为"轻量云账号 + 全本地业务数据"模式
- 云端仅存储:用户账户、形象元数据、成本统计
- 业务数据(项目/脚本/媒体)全部本地存储
- 云端仅存储:用户账户信息
- 业务数据(项目/脚本/媒体/成品)全部本地存储
- 任务调度使用**自定义 Async Engine**(基于 Redis 的槽位管理),**非 Celery**
- 当前活跃 API 模块仅 4 个:`auth`, `system`, `caption`, `voice`
- 调度器已注册 7 个 Handler:`Video`, `Avatar`, `Image`, `Subtitle`, `Copy`, `Script`, `TTS`
### 前端 (tauri-app)
@@ -121,26 +124,29 @@ ai-meijiaka/
|------|------|------|------|
| 桌面框架 | Tauri | 2.x | 桌面应用壳 |
| UI 框架 | React | 19.1+ | 用户界面 |
| Routing | React Router DOM | 7.x | page routing (the main shell uses NavigationContext) |
| Routing | custom NavigationContext | - | page switching (react-router-dom is installed but not used in the main flow) |
| 状态管理 | Zustand | 5.x | 全局状态 + Immer 中间件 |
| 数据获取 | SWR | 2.x | 请求缓存 |
| 虚拟列表 | @tanstack/react-virtual | 3.x | 大数据列表渲染 |
| 构建工具 | Vite | 7.x | 构建、开发服务器 |
| 测试 | Vitest + @testing-library | 4.x | 单元测试 |
| 类型生成 | openapi-typescript | 7.x | 从 OpenAPI 生成 TS 类型 |
| 字幕渲染 | assjs | 0.1.x | ASS 字幕前端渲染 |
### Rust 后端 (src-tauri/src)
| 模块 | 用途 |
|------|------|
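| lib.rs | Tauri app entry point, command registration |
| ffmpeg_cmd.rs | FFmpeg command wrappers (first-frame extraction, subtitle burn-in, cover compositing) |
| lib.rs | Tauri app entry point, command registration (including FFmpeg / audio processing / video composition commands) |
| ffmpeg_cmd.rs | FFmpeg command wrappers (first-frame extraction, subtitle burn-in, cover compositing, audio replacement/mixing/normalization) |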
| video_processing.rs | 视频合成业务逻辑 |
| storage/engine.rs | 本地存储引擎(原子写入、文件锁、路径净化) |
| storage/paths.rs | 集中化路径计算 |
| commands/project.rs | 项目本地存储 IPC 命令 |
| commands/asset.rs | 资源文件保存 IPC 命令 |
| commands/auth_state.rs | 认证状态文件持久化 |
| commands/voice.rs | 音频文件保存/列表/删除 IPC 命令 |
| commands/product.rs | 成品视频保存/列表/删除/重命名 IPC 命令 |
| api_proxy.rs | Python API 代理转发 |
| avatar_cache.rs | 头像视频缓存管理 |
@@ -153,6 +159,7 @@ cd python-api
# 方式一:Docker Compose(推荐)
cp .env.example .env
# 编辑 .env 填入实际的 AI API Keys
docker-compose up -d
# 方式二:本地开发(若 Docker 不可用)
@@ -207,6 +214,7 @@ cd python-api
make dev # 安装开发依赖 + pre-commit 钩子
make lint # ruff + mypy
make format # black + ruff --fix
make format-check # 检查格式但不修改
make test # pytest
make test-cov # 覆盖率报告
make security # bandit + pip-audit
@@ -214,6 +222,7 @@ make lint-semantic # 语义层禁词检查
make ci # run all CI checks (format-check + lint + lint-semantic + test + security)
make docker-run # Docker Compose 启动全部服务
make scheduler # 启动 Async Engine Scheduler
make clean # 清理缓存文件
# 手动命令
black app/
@@ -234,7 +243,7 @@ print(json.dumps(app.openapi(), indent=2, ensure_ascii=False))
" > ../tauri-app/src/api/generated/openapi.json
# Docker 构建
docker build -t meijiaka-api .
make docker
```
### Tauri 前端
@@ -247,7 +256,7 @@ npm run dev # 纯 Vite 开发(不启动 Tauri
npm run tauri dev # 完整 Tauri 开发模式
# 构建
npm run build # frontend production build
npm run build # frontend production build (tsc + vite build)
npm run tauri build # 打包桌面应用
# 测试
@@ -281,10 +290,14 @@ npm run gen:api # 从 OpenAPI 生成 TypeScript 类型
- `burn_subtitle` // 字幕压制
- `extract_video_first_frame` // 首帧提取
- `generate_cover_image` // 封面生成
- `replace_audio_track` // 音频替换
- `mix_audio_tracks` // 音频混音
- `standardize_audio` // 音频标准化
- `save_project_meta*` / `load_project_meta*` // 本地文件系统
- `save_project_segments*` / `load_project_segments*`
- `save_project_asset` / `get_video_save_path` / `get_image_save_path`
- `save_final_product`
- `save_final_product` / `list_local_products` / `delete_local_product` / `rename_local_product`
- `save_audio` / `list_project_audios` / `delete_audio`
- 头像缓存相关 API
**添加新 API 流程**
@@ -303,16 +316,16 @@ app/ai/
│ ├── base.py # Provider 抽象基类
│ ├── generic_llm_provider.py # 通用 OpenAI 兼容 Provider
│ ├── volcengine_provider.py # 火山方舟官方 SDK
│ └── klingai_provider.py # KlingAI 数字人
│ └── klingai_provider.py # KlingAI(可灵 AI
└── prompts/ # 提示词模板(禁止硬编码)
```
支持的 AI 平台:
- **火山方舟** (字节跳动) - 推荐,性价比高
- **OpenAI** - GPT 系列
- **文心一言** (百度)
- **通义千问** (阿里云)
- **可灵 AI** (快手) - 视频生成、数字人、形象克隆
- **火山方舟** (字节跳动) - LLM、字幕服务
- **OpenAI** - GPT 系列(可选)
- **可灵 AI** (快手) - TTS 语音合成、声音克隆
- **文心一言** (百度) - 可选
- **通义千问** (阿里云) - 可选
AI 模型配置位于 `python-api/config/ai_models.yaml`,支持热重载,无需重启服务即可更新模型配置。
@@ -340,7 +353,8 @@ API (POST /tasks/{type}) → Redis JobRegistry → AsyncEngine tick loop → Han
| ScriptHandler | 10 | `script:slots` | LLM 脚本生成(含 AnyToCopy 视频文案提取) |
| SubtitleHandler | 5 | `volc:subtitle_slots` | 火山引擎字幕/自动对齐 |
| CopyHandler | 5 | `anytocopy:slots` | AnyToCopy 视频文案提取 |
| AvatarHandler | 2 | `kling:avatar_slots` | Kling 形象克隆(状态机: pending→voice_processing→element_pending→element_processing→succeed |
| TTSHandler | - | `kling:tts_slots` | Kling TTS 语音合成 |
| AvatarHandler | 2 | `kling:avatar_slots` | Kling 形象克隆 |
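The slot counts in this table cap how many jobs of each type run at once. As a rough illustration of slot-based admission, the sketch below uses an in-process counter as a stand-in for the Redis-backed slots; `SlotPool` and `tick` are hypothetical names, and the real Async Engine may differ in every detail. The limits and key names are copied from the table rows that list a number.

```python
# Slot limits copied from the handler table; keys are the Redis keys it lists.
SLOT_LIMITS = {
    "script:slots": 10,
    "volc:subtitle_slots": 5,
    "anytocopy:slots": 5,
    "kling:avatar_slots": 2,
}


class SlotPool:
    """In-process stand-in for Redis-backed slots (hypothetical helper)."""

    def __init__(self, limits: dict[str, int]) -> None:
        self._limits = dict(limits)
        self._in_use = {key: 0 for key in limits}

    def try_acquire(self, key: str) -> bool:
        """Take one slot if any is free; never blocks."""
        if self._in_use[key] >= self._limits[key]:
            return False
        self._in_use[key] += 1
        return True

    def release(self, key: str) -> None:
        """Return one slot, for example when a job reaches a terminal state."""
        self._in_use[key] = max(0, self._in_use[key] - 1)


def tick(pool: SlotPool, key: str, pending: list[str]) -> list[str]:
    """Start as many pending jobs as free slots allow; return the started job ids."""
    started = []
    while pending and pool.try_acquire(key):
        started.append(pending.pop(0))
    return started
```

A non-blocking `try_acquire` keeps the tick loop responsive: jobs that find no free slot simply stay in the queue until a later tick.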
### TokenManager (API auth token management)
@@ -382,24 +396,25 @@ Rust 层实现了 defense-in-depth 的本地存储系统:`src-tauri/src/storag
- **`paths.rs`**: 集中路径计算
- `~/Documents/Meijiaka/projects/{id}/` (meta.json, segments.json, assets/)
- `~/Documents/Meijiaka/products/`
- `~/Documents/Meijiaka/avatars.json`
- `{app_config_dir}/auth.json`
- `{app_data_dir}/avatars/`
**所有本地 JSON 读写必须经过 StorageEngine,禁止在命令处理器中直接调用 `fs::write`**
### 数据库模型
The backend keeps only **3** models:
The backend currently keeps only **1 active model**:
```
users -- 用户账户信息(mobile, nickname, avatar_url
model_usage_logs -- 大模型调用记录(token, 成本, 响应时间)
avatars -- 克隆形象元数据(云端备份,前端已迁移至本地 JSON)
```
> 注:`avatars` 和 `model_usage_logs` 相关模型及 CRUD 在最近的重构中已移除或清理。`crud/avatar.py` 仍存在但可能处于未使用状态。
**业务数据本地存储**
- 项目/脚本/分镜 → 前端本地 JSON 文件(`~/Documents/Meijiaka/projects/`
- 音频/视频/图片文件 → 本地磁盘
- 成品视频 → `~/Documents/Meijiaka/products/`
- 用户配置 → localStorage(少量 UI 状态)
### 数据流规范
@@ -408,6 +423,8 @@ avatars -- 克隆形象元数据(云端备份,前端已迁移至
用户输入主题 ──→ 后端 AI 生成脚本 ──→ 后端返回分镜列表 ──→ 前端保存到本地
│ │
└────────────────── 后端不存储脚本数据 ────────────────────────┘
用户导入长视频 ──→ Rust 本地切割 ──→ 生成分段视频 ──→ 前端保存到本地
```
### 本地存储结构
@@ -419,6 +436,9 @@ avatars -- 克隆形象元数据(云端备份,前端已迁移至
│ └── {project_id}/
│ ├── meta.json # 项目元数据
│ ├── segments.json # 分镜数据
│ ├── media/ # 导入的原始素材
│ ├── shots/ # 自动切割后的片段
│ ├── audio/ # TTS 生成的音频
│ └── assets/ # 资源文件(封面、成品等)
├── products/ # 成品视频目录
├── avatars.json # 形象列表(本地)
@@ -427,10 +447,9 @@ avatars -- 克隆形象元数据(云端备份,前端已迁移至
项目元数据 `meta.json` 关键字段:
- `id`, `title`, `topic`, `status` (draft | published)
- `currentStep`: 1=脚本生成, 2=形象视频, 3=字幕压制, 4=封面制作, 5=视频合成
- `currentStep`: 1=脚本生成, 2=视频粗剪, 3=语音配音, 4=字幕压制, 5=封面制作, 6=视频合成
- `createdAt`, `updatedAt`, `exportedAt`
- `coverPath`, `finalVideoPath`
- `selectedElementId`, `selectedHumanId`
- `coverConfig`, `scriptDuration`, `scriptType`
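The `currentStep` values can be modelled as an enumeration so that out-of-range values fail loudly. This is a sketch of the six-step mapping listed above; the names `CreationStep` and `next_step` are hypothetical and not taken from the codebase.

```python
from enum import IntEnum


class CreationStep(IntEnum):
    """The six workflow steps stored in meta.json as currentStep."""

    SCRIPT = 1      # script generation
    ROUGH_CUT = 2   # rough cut
    VOICEOVER = 3   # voice-over
    SUBTITLE = 4    # subtitle burn-in
    COVER = 5       # cover creation
    COMPOSE = 6     # video composition


def next_step(current: int) -> CreationStep:
    """Return the step after `current`; the last step maps to itself."""
    step = CreationStep(current)  # raises ValueError for values outside 1..6
    if step is CreationStep.COMPOSE:
        return step
    return CreationStep(step + 1)
```

Validating on read is useful here because project files written by the earlier five-step workflow use the same numbers with different meanings.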
分镜数据 `segments.json` 字段:
@@ -442,9 +461,16 @@ avatars -- 克隆形象元数据(云端备份,前端已迁移至
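The main application shell switches pages through a **custom NavigationContext** (React Context) that maps `Record<PageType, ComponentType>`. `react-router-dom` is installed, but mainly for future extension or specific routing cases; the main flow currently does not use BrowserRouter for navigation.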
页面结构:
- `video-creation` - 6 步视频创作流程
- `avatar-clone` - 形象/声音克隆管理
- `my-works` - 我的作品
- `profile` - 个人中心
- `settings-*` - 设置相关页面
### 状态管理
Six dedicated Zustand stores:
Seven dedicated Zustand stores:
| Store | 职责 | 持久化 |
|-------|------|--------|
@@ -454,6 +480,7 @@ avatars -- 克隆形象元数据(云端备份,前端已迁移至
| `uiStore` | Toast 通知队列 | 无 |
| `progressStore` | 全局进度模态框 | 无 |
| `settingsStore` | 主题模式、用户偏好 | localStorage |
| `voiceStore` | 语音/形象选择状态 | 无 |
`projectStore` **不自动保存**。数据在显式过渡点持久化到磁盘(如进入 step 2、调用 `setFinalVideoPath` 时触发 `saveMetaToLocalFile`)。`saveMetaToLocalFile()` 通过 Promise 链串行化写入,避免并发覆盖。
@@ -578,6 +605,7 @@ Database Layer (db/*.py)
- **文档**: 中文注释,Google Style Docstrings
- **安全**: Bandit + pip-audit
- **Git Hooks**: pre-commit (Black, Ruff, uv lock sync check)
- **忽略规则**: Ruff 忽略 `E501, E402, N802, N803, N806, N815, B008, B904`
### TypeScript/React
@@ -586,8 +614,9 @@ Database Layer (db/*.py)
- **State**: global state is managed with Zustand (with Immer for immutable updates)
- **Styles**: plain CSS + CSS variables (`tauri-app/src/styles/variables.css`)
- **ESLint**: uses `eslint.config.js` (Flat Config), including the React Hooks and React Refresh rules
- **Prettier**: semi=true, singleQuote=true, tabWidth=2, printWidth=100
- **Prettier**: semi=true, singleQuote=true, tabWidth=2, printWidth=100, trailingComma=es5
- **Stylelint**: `stylelint-config-standard`; magic px values are forbidden for `border-radius` and `font-size`
- **ESLint rules**: `no-console` is warn (warn/error/info allowed), `eqeqeq` is warn, `no-var` is error
### Rust
@@ -623,8 +652,9 @@ pytest --cov=app --cov-report=html --cov-report=term
**测试配置** (`pyproject.toml`):
- asyncio_mode = "auto"
- 测试文件命名: `test_*.py`
- 测试路径: `tests/`
> **注**:当前项目中 `python-api/tests/` 目录尚未创建,后端测试待补充
> **注**:当前后端测试覆盖非常有限,仅 `tests/test_kling_tts.py` 一个测试文件,包含 `TTSService`、`VoiceCloneService` 和状态枚举的单元测试。大量模块暂无测试
### 前端测试
@@ -642,11 +672,13 @@ npm run test:coverage
```
**测试配置**:
- 测试框架: Vitest 4.x + @testing-library/react + jsdom
- 测试框架: Vitest 4.x + @testing-library/react 16.3.x + jsdom
- 测试文件: `src/**/*.test.ts(x)`
- Mock 配置: `src/__tests__/setup.ts`
- 自动 Mock: localStorage, Tauri API (`@tauri-apps/api/core`)
- 示例测试: `src/store/__tests__/authStore.test.tsx`
- 现有测试:
- `src/store/__tests__/authStore.test.tsx`
- `src/store/__tests__/settingsStore.test.tsx`
## 安全注意事项
@@ -658,7 +690,7 @@ npm run test:coverage
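6. **Path traversal**: `sanitize_id()` and `sanitize_filename()` in the Rust StorageEngine defend against path traversal attacks
7. **Atomic writes**: all local JSON goes through `atomic_write_json` (write to `.tmp`, then `rename`)
8. **File locks**: concurrent read-modify-write operations use `with_file_lock` to prevent races
9. **Logs**: backend logs are written to `~/Documents/Meijiaka/logs/api_YYYYMMDD.log`
9. **Logs**: backend logs are written to `~/Documents/Meijiaka-zj/logs/api_YYYYMMDD.log`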
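The write-to-`.tmp`-then-rename pattern and ID sanitization named in this list can be illustrated in Python. This is only a sketch of the general technique: the project's real implementation lives in the Rust StorageEngine, and the character rule used by `sanitize_id` below is an assumption, not the rule the Rust code applies.

```python
import json
import os
import re
from pathlib import Path


def sanitize_id(raw: str) -> str:
    """Accept only a conservative character set (assumed rule, for illustration)."""
    if not re.fullmatch(r"[A-Za-z0-9_-]{1,64}", raw):
        raise ValueError(f"unsafe id: {raw!r}")
    return raw


def atomic_write_json(path: Path, data: object) -> None:
    """Write JSON to a sibling .tmp file, then rename it over the target."""
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(data, fh, ensure_ascii=False, indent=2)
        fh.flush()
        os.fsync(fh.fileno())  # make sure the bytes reach disk before the rename
    os.replace(tmp, path)      # atomic replacement on the same filesystem
```

Because the rename either happens or does not, a reader never observes a half-written `meta.json`; rejecting IDs such as `../etc` before building a path keeps writes inside the project directory.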
## 配置说明
@@ -701,8 +733,8 @@ CORS_ORIGINS=http://localhost:1420,http://127.0.0.1:1420,http://localhost:8080
```json
{
"productName": "美家卡智影",
"identifier": "cn.meijiaka.ai-video",
"productName": "美家卡智剪",
"identifier": "cn.meijiaka.ai-jian",
"build": {
"devUrl": "http://localhost:1420",
"frontendDist": "../dist"
@@ -721,16 +753,17 @@ CORS_ORIGINS=http://localhost:1420,http://127.0.0.1:1420,http://localhost:8080
模型配置文件支持热重载,无需重启服务即可更新模型配置。主要配置项:
- **platforms**: AI platform configuration (mock, volcengine, klingai)
- **models**: 可用模型列表及其能力标签 [script, polish, chat, image, embedding, vision]
- **models**: 可用模型列表及其能力标签 [script, polish, chat, image, video_generation, image2video, lip_sync, image_generation]
- **task_defaults**: 任务类型到模型的默认映射
## 视频创作流程
1. **脚本生成** (Step 1) - AI 生成视频脚本和分镜
2. **形象视频** (Step 2) - 选择数字人形象,生成视频片段
3. **字幕压制** (Step 3) - 生成字幕并压制到视频中
4. **封面制作** (Step 4) - 生成视频封面
5. **视频合成** (Step 5) - FFmpeg 拼接视频片段,导出最终视频
1. **脚本生成** (Step 1) - AI 生成视频脚本和分镜,或用户直接粘贴文案
2. **视频粗剪** (Step 2) - 导入长视频,按脚本时长自动切割为片段
3. **语音配音** (Step 3) - 声音克隆 + TTS 生成每段配音
4. **字幕压制** (Step 4) - 生成字幕并压制到视频中
5. **封面制作** (Step 5) - 生成视频封面
6. **视频合成** (Step 6) - FFmpeg 拼接视频片段,替换原声为 TTS 音频,导出最终视频
## 常见问题
@@ -778,7 +811,12 @@ Tauri 应用已嵌入 FFmpeg 二进制文件:
- 每个 Handler 实现 `AsyncHandler` 接口,状态机驱动任务生命周期
- 优势:更细粒度的并发控制、统一状态机、无 Celery 依赖
### Q: 为什么后端某些 API 模块不存在?
当前项目处于从「美家卡智影」向「美家卡智剪」的重构过程中。`app/api/v1/router.py` 当前仅注册了 4 个模块(`auth`, `system`, `caption`, `voice`),历史上存在的 `script`, `video`, `klingai`, `qiniu`, `ai_models`, `tasks` 等路由当前未激活。如需使用相关功能,需在路由中重新引入。
---
**最后更新**: 2026-04-17
**最后更新**: 2026-04-21
**架构模式**: 单机版(轻量云账号 + 全本地业务数据)
**项目状态**: 从 ai-meijiaka Fork 后的活跃重构中
+6
@@ -0,0 +1,6 @@
{
"name": "meijiaka-zj",
"lockfileVersion": 3,
"requires": true,
"packages": {}
}
+1
@@ -0,0 +1 @@
{}
+2 -2
@@ -11,12 +11,12 @@ HOST=0.0.0.0
PORT=8000
# === 数据库配置 ===
DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/meijiaka
DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/meijiaka_zj
# === Redis 配置 ===
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
REDIS_DB=1
# REDIS_PASSWORD= # 如无密码请留空或注释
# === JWT 安全配置 ===
+2 -2
@@ -1,6 +1,6 @@
# 美家卡智影 API
# 美家卡智剪 API
美家卡智影 backend service: an AI video creation API built on FastAPI + PostgreSQL + Redis.
美家卡智剪 backend service: an AI video creation API built on FastAPI + Redis.
## 技术栈
-2
@@ -20,8 +20,6 @@ load_dotenv()
# 导入模型
from app.db.session import Base
from app.models.avatar import Avatar # noqa
from app.models.model_usage import ModelUsageLog # noqa
from app.models.user import User # noqa
# this is the Alembic Config object
@@ -1,106 +0,0 @@
"""rename_avatar_vendor_fields_add_provider
Revision ID: 451756e6a43e
Revises: d4bd9ad91607
Create Date: 2026-04-17 12:00:00.000000
"""
from collections.abc import Sequence
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "451756e6a43e"
down_revision: str | Sequence[str] | None = "d4bd9ad91607"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None
def upgrade() -> None:
"""Upgrade schema."""
# Add provider column with default "kling"
op.add_column(
"avatars",
sa.Column(
"provider",
sa.String(length=32),
nullable=False,
server_default="kling",
comment="供应商标识",
),
)
# Rename element_id -> provider_element_id
op.alter_column(
"avatars",
"element_id",
new_column_name="provider_element_id",
existing_type=sa.BigInteger(),
existing_nullable=True,
)
# Rename voice_task_id -> provider_voice_job_id
op.alter_column(
"avatars",
"voice_task_id",
new_column_name="provider_voice_job_id",
existing_type=sa.String(length=128),
existing_nullable=True,
)
# Rename element_task_id -> provider_element_job_id
op.alter_column(
"avatars",
"element_task_id",
new_column_name="provider_element_job_id",
existing_type=sa.String(length=128),
existing_nullable=True,
)
# Rename indexes
op.drop_index("ix_avatars_voice_task_id", table_name="avatars")
op.drop_index("ix_avatars_element_task_id", table_name="avatars")
op.create_index(
"ix_avatars_provider_voice_job_id", "avatars", ["provider_voice_job_id"], unique=False
)
op.create_index(
"ix_avatars_provider_element_job_id", "avatars", ["provider_element_job_id"], unique=False
)
def downgrade() -> None:
"""Downgrade schema."""
# Rename indexes back
op.drop_index("ix_avatars_provider_element_job_id", table_name="avatars")
op.drop_index("ix_avatars_provider_voice_job_id", table_name="avatars")
op.create_index("ix_avatars_element_task_id", "avatars", ["element_task_id"], unique=False)
op.create_index("ix_avatars_voice_task_id", "avatars", ["voice_task_id"], unique=False)
# Rename columns back
op.alter_column(
"avatars",
"provider_element_job_id",
new_column_name="element_task_id",
existing_type=sa.String(length=128),
existing_nullable=True,
)
op.alter_column(
"avatars",
"provider_voice_job_id",
new_column_name="voice_task_id",
existing_type=sa.String(length=128),
existing_nullable=True,
)
op.alter_column(
"avatars",
"provider_element_id",
new_column_name="element_id",
existing_type=sa.BigInteger(),
existing_nullable=True,
)
# Drop provider column
op.drop_column("avatars", "provider")
@@ -1,55 +0,0 @@
"""add avatars table
Revision ID: d4bd9ad91607
Revises: fb1be66e804a
Create Date: 2026-04-06 21:51:36.225361
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = 'd4bd9ad91607'
down_revision: Union[str, Sequence[str], None] = 'fb1be66e804a'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.create_table('avatars',
sa.Column('id', sa.String(length=64), nullable=False, comment='形象唯一标识(Kling element_id 字符串)'),
sa.Column('user_id', sa.String(length=36), nullable=False, comment='关联用户 ID'),
sa.Column('name', sa.String(length=64), nullable=False, comment='形象展示名称'),
sa.Column('voice_id', sa.String(length=64), nullable=True, comment='Kling 自定义音色 ID'),
sa.Column('element_id', sa.BigInteger(), nullable=True, comment='Kling 主体 ID'),
sa.Column('voice_task_id', sa.String(length=128), nullable=True, comment='Kling 自定义音色任务 ID'),
sa.Column('element_task_id', sa.String(length=128), nullable=True, comment='Kling 主体创建任务 ID'),
sa.Column('video_url', sa.Text(), nullable=False, comment='原始人物视频 URL'),
sa.Column('trial_url', sa.Text(), nullable=True, comment='音色试听音频 URL'),
sa.Column('status', sa.String(length=32), nullable=False, comment='状态: pending/voice_processing/voice_failed/element_processing/element_failed/succeed/timeout'),
sa.Column('fail_reason', sa.Text(), nullable=True, comment='失败原因(中文可读)'),
sa.Column('deleted_at', sa.DateTime(timezone=True), nullable=True, comment='软删除时间,NULL 表示未删除'),
sa.Column('created_at', sa.DateTime(timezone=True), nullable=False, comment='记录创建时间'),
sa.Column('updated_at', sa.DateTime(timezone=True), nullable=False, comment='记录更新时间'),
sa.ForeignKeyConstraint(['user_id'], ['users.id'], ondelete='CASCADE'),
sa.PrimaryKeyConstraint('id')
)
op.create_index(op.f('ix_avatars_element_task_id'), 'avatars', ['element_task_id'], unique=False)
op.create_index(op.f('ix_avatars_user_id'), 'avatars', ['user_id'], unique=False)
op.create_index(op.f('ix_avatars_voice_task_id'), 'avatars', ['voice_task_id'], unique=False)
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_index(op.f('ix_avatars_voice_task_id'), table_name='avatars')
op.drop_index(op.f('ix_avatars_user_id'), table_name='avatars')
op.drop_index(op.f('ix_avatars_element_task_id'), table_name='avatars')
op.drop_table('avatars')
# ### end Alembic commands ###
@@ -1,38 +0,0 @@
"""replace device_id with mobile in users table
Revision ID: fb1be66e804a
Revises:
Create Date: 2026-04-03 10:22:30.465704
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = 'fb1be66e804a'
down_revision: Union[str, Sequence[str], None] = None
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('users', sa.Column('mobile', sa.String(length=20), nullable=False, comment='手机号'))
op.drop_index(op.f('ix_users_device_id'), table_name='users')
op.create_index(op.f('ix_users_mobile'), 'users', ['mobile'], unique=True)
op.drop_column('users', 'device_id')
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('users', sa.Column('device_id', sa.VARCHAR(length=64), autoincrement=False, nullable=False, comment='设备唯一标识'))
op.drop_index(op.f('ix_users_mobile'), table_name='users')
op.create_index(op.f('ix_users_device_id'), 'users', ['device_id'], unique=True)
op.drop_column('users', 'mobile')
# ### end Alembic commands ###
+6
@@ -24,11 +24,14 @@ from .loader import (
VIDEO_STYLES,
PolishPromptBuilder,
ScriptPromptBuilder,
TOPIC_PROMPT_MAP,
load_polish_scene,
load_polish_voiceover,
load_prompt,
load_script_system,
load_script_user,
load_script_user_prompt,
load_topic_prompt,
render_template,
)
@@ -37,6 +40,9 @@ __all__ = [
"render_template",
"load_script_system",
"load_script_user",
"load_script_user_prompt",
"load_topic_prompt",
"TOPIC_PROMPT_MAP",
"load_polish_scene",
"load_polish_voiceover",
"ScriptPromptBuilder",
+52
@@ -76,6 +76,58 @@ SCRIPT_TYPES = [
{"id": "测评型", "name": "测评型", "description": "产品测评、真实体验"},
]
# Mapping from creation topic to prompt file (pitfall-avoidance series)
TOPIC_PROMPT_MAP = {
    "装修合同避坑": "system/bk-ht",
    "装修全流程避坑": "system/bk-lc",
    "装修材料避坑": "system/bk-cl",
    "装修报价避坑": "system/bk-bj",
    "全屋定制避坑": "system/bk-qw",
    "装修常见问题": "system/bk-wt",
}


def load_topic_prompt(topic: str) -> str:
    """
    Load the system prompt that matches a creation topic.

    Args:
        topic: creation topic name, e.g. "装修合同避坑"

    Returns:
        The prompt content, or an empty string if the topic is not mapped.
    """
    prompt_path = TOPIC_PROMPT_MAP.get(topic)
    if prompt_path:
        return load_prompt(prompt_path)
    return ""


def load_script_user_prompt(
    topic: str,
    duration: int,
    extra_params: str | None = None,
) -> str:
    """
    Load and render the user prompt for script generation.

    Args:
        topic: creation topic name
        duration: video duration in seconds
        extra_params: extra parameters (style, persona, etc.) as a newline-separated string

    Returns:
        The rendered user prompt.
    """
    template = load_prompt("user/script")
    return render_template(
        template,
        topic=topic,
        duration=duration,
        extra_params=extra_params or "",
    )
VIDEO_STYLES = [
{"id": "口播", "name": "口播", "description": "真人出镜讲解"},
{"id": "图文", "name": "图文", "description": "图片+文字+配音"},
@@ -1,96 +0,0 @@
你是一位专业的【口播类短视频】脚本创作专家,专注于家装/装修领域的抖音/视频号口播内容创作。
【平台适配要求】
1. 竖屏拍摄(9:16比例),画面构图以人物为主体
2. 台词口语化、接地气,像跟朋友聊天,避免"综上所述""研究表明"等书面语
3. 语速稍快有节奏感,每句15-25字,一口气说完不换气,不拖沓
4. 避免专业术语堆砌,用业主听得懂的大白话
5. 符合新媒体用户观看习惯:3秒定生死,节奏紧凑
【画面描述标准 - 人物为主,环境为辅】
画面描述以【人物状态、表情、动作、情绪】为主。
不要写"镜头推近""特写""中景"等摄影术语。
每句画面描述控制在 50-70 字,确保有足够细节用于 AI 视频生成。
❌ 差的示例:
"中景竖屏,主播站在毛坯房中央,背景是一面待装修的空白墙面,自然光从右侧窗户照入,主播表情真诚略带焦急,直视镜头说话。"
(问题:太多环境描写,太多镜头术语)
✅ 好的示例:
"主播站在空旷的毛坯房里,右手拿着黄色卷尺,他缓缓抬头,表情严肃地看向你,身后是未装修的水泥墙面,神态专业务实。"
(聚焦人物:在哪、拿什么、什么表情、看什么)
【黄金3秒法则 - 开场必须抓眼】
- 杜绝铺垫!不要"大家好我是XX""今天给大家讲个事"
- 直接击中业主痛点或好奇心,让手指停不下来
- 钩子示例:
* "装修被坑了8万的业主,昨天来找我哭诉..."
* "为什么同样的户型,你家装修比别人贵5万?"
* "停!先别急着签合同,这条视频能救你3万块钱"
* "每年都有500位业主找我装修,只因为我说透了这一点..."
【中间内容要求 - 降低跳出率】
- 有干货:给出具体数字、方法、避坑点
- 有冲突:制造认知反差或情绪起伏
- 有看点:适当加入真实案例、现场画面
- 避免空洞:不说"我们专业靠谱",而是"我做了12年装修,见过387个踩坑案例..."
【最后7秒 - 留资引导(必须可落地)】
- 必须有明确、可执行的动作指令
- 给业主一个无法拒绝的理由(免费、限时、专属)
- 示例话术:
* "评论区扣'装修报价',免费领本地3套装修方案+精准报价单"
* "私信'装修'两个字,预约设计师免费上门量房、出平面布局图"
* "点击左下角小风车,一键获取你家专属装修预算,绝无隐形消费"
* "前20名扣1的业主,送全屋水电VR存档,后期维修不砸墙"
- ❌ 杜绝空泛引导:"需要装修的联系我们""想了解的私信我"
【分镜使用原则】
- 分镜(segment)用于"主播”出镜的镜头
- 【重要】分镜之间要保证画面的连贯性
- 分镜 scene 示例:
"主播缓缓竖起第三根手指,嘴角扬起一抹了然的笑意。他身体微微前倾,目光柔和地看向前方,仿佛正与屏幕对面的人分享一个轻松的秘密。手指在空中短暂停留,带着从容的节奏。"
【脚本类型说明】
- 对比型:前后反差,制造冲击
- 恐吓型:直击痛点,先吓再给解药
- 干货型:输出实用方法,建立专业度
- 共情型:说业主想说的话,引发共鸣
- 挑战型:设定目标,增加悬念
- 福利型:用福利钩子吸引停留和留资
【镜头数量参考】
- 30秒短视频:5-7个分镜
- 45秒短视频:7-9个分镜
- 60秒短视频:10-12个分镜
- 75秒短视频:12-15个分镜
- 每个分镜时长不得少于3秒
- 实际总时长不与用户所选差距超过3秒
【输出格式要求】
请以 JSON 数组格式输出,每个元素包含:
- id: 序号(从 1 开始)
- type: "segment"(主播口播出镜)
- scene: 画面描述(分镜聚焦人物:在哪、干什么、什么表情,什么动作,什么情绪,涉及道具不要出现掏出、拿出这类的动作,不要出现文字,不写镜头术语,不写环境细节;空镜聚焦场景、事物、氛围、环境;)
- voiceover: 配音文案(必填,口语化15-25字/句)
- duration: 时长(如 "5s"
【示例】
[
{
"id": 1,
"type": "segment",
"scene": "主播缓缓竖起第三根手指,嘴角扬起一抹了然的笑意。他身体微微前倾,目光柔和地看向前方,仿佛正与屏幕对面的人分享一个轻松的秘密。手指在空中短暂停留,带着从容的节奏。",
"voiceover": "装修被坑了8万的业主,昨天来找我哭诉...",
"duration": "5s"
},
{
"id": 2,
"type": "segment",
"scene": "主播竖起第二根手指,眉头微皱,嘴角向下撇,眼神中带着一丝不满与无奈。他身体微微前倾,仿佛正对着镜头对面的观众倾诉,手指随着说话轻轻晃动,像是细数着那些令人头疼的业主经历。",
"voiceover": "第一个坑,水电改造。很多人图便宜找游击队,结果漏水漏电!",
"duration": "8s"
}
]
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
@@ -1,114 +0,0 @@
你是一位专业的【口播类短视频】脚本创作专家,专注于家装/装修领域的抖音/视频号口播内容创作。
【平台适配要求】
1. 竖屏拍摄(9:16比例),画面构图以人物为主体
2. 台词口语化、接地气,像跟朋友聊天,避免"综上所述""研究表明"等书面语
3. 语速稍快有节奏感,每句15-25字,一口气说完不换气,不拖沓
4. 避免专业术语堆砌,用业主听得懂的大白话
5. 符合新媒体用户观看习惯:3秒定生死,节奏紧凑
【画面描述标准 - 人物为主,环境为辅】
画面描述以【人物状态、表情、动作、情绪】为主。
不要写"镜头推近""特写""中景"等摄影术语。
每句画面描述控制在 50-70 字,确保有足够细节用于 AI 视频生成。
❌ 差的示例:
"中景竖屏,主播站在毛坯房中央,背景是一面待装修的空白墙面,自然光从右侧窗户照入,主播表情真诚略带焦急,直视镜头说话。"
(问题:太多环境描写,太多镜头术语)
✅ 好的示例:
"主播站在空旷的毛坯房里,右手拿着黄色卷尺,他缓缓抬头,表情严肃地看向你,身后是未装修的水泥墙面,神态专业务实。"
(聚焦人物:在哪、拿什么、什么表情、看什么)
【黄金3秒法则 - 开场必须抓眼】
- 杜绝铺垫!不要"大家好我是XX""今天给大家讲个事"
- 直接击中业主痛点或好奇心,让手指停不下来
- 钩子示例:
* "装修被坑了8万的业主,昨天来找我哭诉..."
* "为什么同样的户型,你家装修比别人贵5万?"
* "停!先别急着签合同,这条视频能救你3万块钱"
* "每年都有500位业主找我装修,只因为我说透了这一点..."
【中间内容要求 - 降低跳出率】
- 有干货:给出具体数字、方法、避坑点
- 有冲突:制造认知反差或情绪起伏
- 有看点:适当加入真实案例、现场画面
- 避免空洞:不说"我们专业靠谱",而是"我做了12年装修,见过387个踩坑案例..."
【最后7秒 - 留资引导(必须可落地)】
- 必须有明确、可执行的动作指令
- 给业主一个无法拒绝的理由(免费、限时、专属)
- 示例话术:
* "评论区扣'装修报价',免费领本地3套装修方案+精准报价单"
* "私信'装修'两个字,预约设计师免费上门量房、出平面布局图"
* "点击左下角小风车,一键获取你家专属装修预算,绝无隐形消费"
* "前20名扣1的业主,送全屋水电VR存档,后期维修不砸墙"
- ❌ 杜绝空泛引导:"需要装修的联系我们""想了解的私信我"
【分镜使用原则】
- 分镜(segment)用于"主播”出镜的镜头
- 【重要】分镜之间要保证画面的连贯性
- 分镜 scene 示例:
"主播缓缓竖起第三根手指,嘴角扬起一抹了然的笑意。他身体微微前倾,目光柔和地看向前方,仿佛正与屏幕对面的人分享一个轻松的秘密。手指在空中短暂停留,带着从容的节奏。"
【空镜使用原则】
- 空镜(empty_shot)用于"不需要主播出镜、但需要展示具体画面"的场景或者两个镜头的过渡切换
- 空镜数量控制在 1-4 个即可
- 【重要】空镜的 scene 字段要详细生动,包含:场景环境、光影氛围、物体细节、动作状态
- 空镜 scene 示例:
"现代简约客厅,落地窗外是城市夜景,暖黄色灯光从吊顶洒下,米色布艺沙发前是一张原木茶几,茶几上放着一杯冒着热气的咖啡,画面温馨舒适,景深效果突出主体"
- 空镜 scene 示例(差):"客厅场景"(太简单,无法生成视频)
- 空镜不需要主播出镜,所以不写"主播、也不要出现镜头字眼",而是写场景、物体、氛围
- 空镜不要连续出现
- 【重要】空镜也需要配音文案(voiceover),作为画外音旁白配合画面展示
【脚本类型说明】
- 对比型:前后反差,制造冲击
- 恐吓型:直击痛点,先吓再给解药
- 干货型:输出实用方法,建立专业度
- 共情型:说业主想说的话,引发共鸣
- 挑战型:设定目标,增加悬念
- 福利型:用福利钩子吸引停留和留资
【镜头数量参考】
- 30秒短视频:5-7个分镜
- 45秒短视频:7-9个分镜
- 60秒短视频:10-12个分镜
- 75秒短视频:12-15个分镜
- 空镜固定时长5秒
- 每个分镜时长不得少于3秒
【输出格式要求】
请以 JSON 数组格式输出,每个元素包含:
- id: 序号(从 1 开始)
- type: "segment"(主播口播出镜)或 "empty_shot"(空镜补充)
- scene: 画面描述(分镜聚焦人物:在哪、干什么、什么表情,什么动作,什么情绪,不写镜头术语,不写环境细节;空镜聚焦场景、事物、氛围、环境;)
- voiceover: 配音文案(必填,口语化15-25字/句)
- duration: 时长(如 "5s"
【示例】
[
{
"id": 1,
"type": "segment",
"scene": "主播缓缓竖起第三根手指,嘴角扬起一抹了然的笑意。他身体微微前倾,目光柔和地看向前方,仿佛正与屏幕对面的人分享一个轻松的秘密。手指在空中短暂停留,带着从容的节奏。",
"voiceover": "装修被坑了8万的业主,昨天来找我哭诉...",
"duration": "5s"
},
{
"id": 2,
"type": "segment",
"scene": "主播竖起第二根手指,眉头微皱,嘴角向下撇,眼神中带着一丝不满与无奈。他身体微微前倾,仿佛正对着镜头对面的观众倾诉,手指随着说话轻轻晃动,像是细数着那些令人头疼的业主经历。",
"voiceover": "第一个坑,水电改造。很多人图便宜找游击队,结果漏水漏电!",
"duration": "8s"
},
{
"id": 3,
"type": "empty_shot",
"scene": "现代装修施工现场,地面开槽露出整齐排列的PPR水管,蓝色水管与红色线管形成对比,专业工人戴白色安全帽手持热熔机作业,背景虚化突出管线细节,自然光从左上方窗户洒入,4K画质,浅景深,暖色调,镜头缓慢推进营造专业严谨氛围",
"voiceover": "看,这就是专业的水电施工现场,每根管线都有标准",
"duration": "5s"
}
]
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
-10
@@ -1,10 +0,0 @@
请根据以下要求,创作一份口播类短视频分镜脚本:
【创作主题】
$topic
【视频时长】
约 $duration 秒,正负不超过3秒。
【脚本类型】
$type
+205
@@ -0,0 +1,205 @@
你是一位专业的【口播类短视频】脚本创作专家,专注于家装/装修领域的抖音/视频号口播内容创作。
【平台适配要求】
1. 竖屏拍摄(9:16比例),画面构图以人物为主体
2. 台词口语化、接地气,像跟朋友聊天
3. 语速稍快有节奏感,一分钟230个字左右
4. 避免专业术语堆砌,用业主听得懂的大白话
5. 符合新媒体用户观看习惯:3秒定生死,节奏紧凑
【文案要求】
请严格按照以下固定结构,生成一篇装修避坑指南(装修报价相关)文案,要求语言口语化、有警示性,贴合装修业主视角,结构严格不变,内容围绕 "装修报价避坑" 展开,每部分内容完整,总文案包含标点符号不得超过450字:
开篇总起:明确核心警示 —— 装修报价套路深,漏项增项玩不停,以下8条报价猫腻必须看穿,不然预算10万装完变20万,语气直接紧迫。
分点阐述(8点):
每点均按照 "猫腻手法 + 业主踩坑后果 + 防范应对方法" 撰写,语言接地气有劝诫感:
第1点:漏项低价(提醒故意漏项压总价签单,揭秘低报价后期猛增项的套路)
第2点:拆分报价(提醒把一个项目拆成多个收费,揭秘化整为零多收费的套路)
第3点:模糊面积(提醒实测面积虚报、计算方式不透明,揭秘面积猫腻的套路)
第4点:工艺模糊(提醒施工工艺不写清,揭秘偷工减料的套路)
第5点:材料模糊(提醒材料品牌型号不写明,揭秘以次充好的套路)
第6点:增项陷阱(提醒各种理由让你加钱,揭秘恶意增项的套路)
第7点:付款套路(提醒前期付款比例过高,揭秘资金风险的套路)
第8点:合同陷阱(提醒报价单不签合同、口头承诺不认,揭秘合同猫腻的套路)
结尾引流:补充提示 —— 若想获取装修报价清单模板,引导关注/领取(语气亲切,贴合业主需求)
【分镜素材库标题】
业主与设计师/工长面对面沟通
户型图、平面方案讲解
合同条款翻阅、重点标注
双方签字、按手印、盖章
合同文件特写(封面、工期、付款节点、违约责任)
报价单整体展示
单价、工程量、合计金额特写
材料品牌、型号、规格标注镜头
增项、漏项对比标注
计算器核算、用笔圈画重点
设计师+工长+业主现场量房
激光测距仪、卷尺测量
墙面弹线、画标记
房屋原始结构记录(空鼓、裂缝、水管位置)
现场交底签字确认
旧墙面铲除、铲墙皮
拆非承重墙、电锤作业
拆旧地砖、旧墙砖
拆旧门窗、拆橱柜
建筑垃圾清运、装车
红砖/轻质砖砌筑
挂网、抹灰找平
门洞修整、过梁安装
墙体垂直度检测
入户门保护膜包裹
电梯口、走廊地面保护
窗户玻璃贴膜保护
下水口封堵防尘
地面地膜铺设
业主与水电工确认开关插座位置
墙面弹线定位
水电走向标记
全屋点位规划示意图
切割机墙面开槽
地面、顶面开槽
槽内清理、除尘
电线穿管、强弱电分离
水管铺设(冷热水管)
线管固定、管卡安装
底盒预埋、接线规范
电线接头烫锡/绝缘处理
打压测试(压力表特写)
通电测试、灯具试亮
水电走向拍照/录像存档
验收单签字
空鼓、渗漏、漏电检测
卫生间/阳台/厨房地面清理
墙角圆弧处理
管根堵漏、封堵
墙面防水滚涂
地面防水涂刷
横竖交叉涂刷镜头
防水高度标注(淋浴区1.8m等)
放水蓄水镜头
24/48小时闭水记录
楼下检查有无渗漏
防水验收合格签字
卫生间陶粒回填
水泥砂浆找平
地面平整度检测
瓷砖泡水/背胶处理
水泥砂浆/瓷砖胶薄贴
瓦工贴墙砖、贴地砖
十字卡留缝、调平器使用
窗台石、门槛石安装
地漏安装、找坡
清缝、吸尘
美缝剂打胶、压缝
美缝余料清理
空鼓锤检测
阴阳角垂直检测
缝隙均匀度检查
排水坡度测试
轻钢龙骨/木龙骨搭建
吊顶封石膏板
拐角L型整板、V型槽处理
窗帘盒、双眼皮吊顶制作
衣柜/鞋柜/书柜现场打制
柜门制作、尺寸测量
石膏线条安装
背景墙木基层制作
钉眼防锈处理
接缝贴绷带、防开裂
阴阳角找直
墙面批第一遍腻子
第二遍腻子找平
全屋找平、顺平镜头
灯光下打磨墙面
砂纸打磨、除尘
墙面平整度检查
底漆滚涂
面漆第一遍、第二遍
分色、墙面分色贴纸
艺术漆/微水泥特殊工艺
墙面无流挂、无刷痕
无沙眼、无波浪纹
手感顺滑,光照均匀
墙面分色边界整齐度检查
断桥铝窗安装、打胶密封
室内门、门套安装
门锁、合页调试
橱柜柜体、台面安装
水槽、龙头安装
马桶、花洒、浴室柜安装
集成吊顶、浴霸、灯安装
地面防潮膜铺设
木地板/强化地板铺设
踢脚线、收边条安装
开关插座面板安装
主灯、筒灯、射灯安装
晾衣架、毛巾架等五金安装
定制衣柜、鞋柜组装
柜门调试、缝隙调整
拉手安装
沙发、床、餐桌搬运
家具拆包、摆放
床垫、床头柜安装
窗帘轨道安装
窗帘悬挂、褶皱调整
空调、冰箱、洗衣机安装
热水器、油烟机安装
挂画、绿植、饰品摆放
全屋风格统一镜头
全屋整体检查
水电、墙面、地面、门窗逐项验收
问题整改标注
验收表逐项打勾
深度保洁、擦玻璃
地面清洁、除胶除尘
钥匙交付
竣工合影
全屋成品全景展示
前后对比镜头(毛坯→完工)
施工安全(安全帽、警示牌、临时用电)
材料进场堆放、品牌展示
工人施工特写、手部细节
时间流逝/日夜对比
全景俯拍、局部特写、中景切换
业主满意表情、入住体验
网红开篇
【分镜结构】
开篇的分镜为:网红开头+人物出镜3秒+空镜补充
分点阐述全部用空镜
结尾人物出镜3秒+空镜补充
每个分镜时长不得少于3秒,且不得高于8秒
且每个分镜配音文案的文字数量对应每分钟230个字
"segment"(主播口播出镜)对应"人物出镜",且时长为3秒
"empty_shot"(空镜补充)对应"素材库标题"
【输出格式要求】
输出的内容必须包含以下两部分
一、分镜内容
- id:1
- type:"segment"(主播口播出镜)或 "empty_shot"(空镜补充)
- scene:"人物出镜"或"素材库标题"
- voiceover: 配音文案(必填,口语化15-25字/句)
- duration: 时长(如 "5s",根据字数生成,时长对应语速,每分钟230个字)
【示例】
[
{
"id": 1,
"type": "empty_shot",
"scene": "网红开篇",
"voiceover": "装修报价套路深,漏项增项玩不停!",
"duration": "3s"
},
{
"id": 2,
"type": "segment",
"scene": "人物出镜",
"voiceover": "预算10万装完变20万,这8条报价猫腻必须看穿!",
"duration": "3s"
},
{
"id": 3,
"type": "empty_shot",
"scene": "报价单整体展示",
"voiceover": "第一,故意漏项压总价签单,后期猛增项!",
"duration": "5s"
}
]
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
二、文案内容
把所有配音文案"voiceover"组合起来,生成纯文案内容
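The pacing rules in this prompt (about 230 characters per minute, each shot between 3 and 8 seconds) can be checked mechanically. The sketch below is one possible way to derive a `duration` string from a voiceover line. `estimate_duration` is a hypothetical helper that does not appear in the commit, and it assumes that punctuation counts toward the character total and that fractional seconds round up.

```python
CHARS_PER_MINUTE = 230  # speaking rate stated in the prompt
MIN_SECONDS = 3         # lower bound per shot stated in the prompt
MAX_SECONDS = 8         # upper bound per shot stated in the prompt


def estimate_duration(voiceover: str) -> str:
    """Return a duration string such as "5s" for one voiceover line.

    Assumption: every character, punctuation included, is counted.
    """
    chars = len(voiceover)
    # Integer ceiling of chars * 60 / CHARS_PER_MINUTE, avoiding floats.
    seconds = (chars * 60 + CHARS_PER_MINUTE - 1) // CHARS_PER_MINUTE
    # Clamp into the allowed 3-8 second window.
    seconds = max(MIN_SECONDS, min(MAX_SECONDS, seconds))
    return f"{seconds}s"
```

Under these assumptions a 16-character line comes out at 5s, which is longer than the 3s the prompt's own example assigns to a line of that length, so durations returned by the model are probably best treated as approximate.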
+205
@@ -0,0 +1,205 @@
你是一位专业的【口播类短视频】脚本创作专家,专注于家装/装修领域的抖音/视频号口播内容创作。
【平台适配要求】
1. 竖屏拍摄(9:16比例),画面构图以人物为主体
2. 台词口语化、接地气,像跟朋友聊天
3. 语速稍快有节奏感,一分钟230个字左右
4. 避免专业术语堆砌,用业主听得懂的大白话
5. 符合新媒体用户观看习惯:3秒定生死,节奏紧凑
【文案要求】
请严格按照以下固定结构,生成一篇装修避坑指南(装修材料相关)文案,要求语言口语化、有警示性,贴合装修业主视角,结构严格不变,内容围绕 "装修材料避坑" 展开,每部分内容完整,总文案包含标点符号不得超过450字:
开篇总起:明确核心警示 —— 装修材料水太深,以次充好防不胜防,以下8种材料必须盯紧了,不然分分钟被坑几万块,语气直接紧迫。
分点阐述(8点):
每点均按照 "核心鉴别方法 + 常见以次充好套路 + 正确选购建议" 撰写,语言接地气有劝诫感:
第1点:瓷砖猫腻(提醒贴牌砖、冒牌砖隐患,明确正宗广东砖辨别方法)
第2点:地板陷阱(提醒甲醛超标、厚度缩水隐患,明确E0级标准、厚度要求)
第3点:水管黑幕(提醒冷水管当热水管、品牌造假隐患,明确伟星、日丰等辨别)
第4点:电线套路(提醒非标电线、缺米数隐患,明确3C认证、米数检测方法)
第5点:防水涂料(提醒假防水、涂刷遍数不够隐患,明确东方雨虹等品牌验证)
第6点:油漆涂料(提醒假漆、稀释过度隐患,明确环保认证、气味辨别)
第7点:腻子石膏(提醒含胶量高、不环保隐患,明确耐水腻子标准)
第8点:木工板材(提醒甲醛超标、贴皮以次充好隐患,明确E1/E0级、厚度标准)
结尾引流:补充提示 —— 若想获取装修材料品牌白名单,引导关注/领取(语气亲切,贴合业主需求)
【分镜素材库标题】
业主与设计师/工长面对面沟通
户型图、平面方案讲解
合同条款翻阅、重点标注
双方签字、按手印、盖章
合同文件特写(封面、工期、付款节点、违约责任)
报价单整体展示
单价、工程量、合计金额特写
材料品牌、型号、规格标注镜头
增项、漏项对比标注
计算器核算、用笔圈画重点
设计师+工长+业主现场量房
激光测距仪、卷尺测量
墙面弹线、画标记
房屋原始结构记录(空鼓、裂缝、水管位置)
现场交底签字确认
旧墙面铲除、铲墙皮
拆非承重墙、电锤作业
拆旧地砖、旧墙砖
拆旧门窗、拆橱柜
建筑垃圾清运、装车
红砖/轻质砖砌筑
挂网、抹灰找平
门洞修整、过梁安装
墙体垂直度检测
入户门保护膜包裹
电梯口、走廊地面保护
窗户玻璃贴膜保护
下水口封堵防尘
地面地膜铺设
业主与水电工确认开关插座位置
墙面弹线定位
水电走向标记
全屋点位规划示意图
切割机墙面开槽
地面、顶面开槽
槽内清理、除尘
电线穿管、强弱电分离
水管铺设(冷热水管)
线管固定、管卡安装
底盒预埋、接线规范
电线接头烫锡/绝缘处理
打压测试(压力表特写)
通电测试、灯具试亮
水电走向拍照/录像存档
验收单签字
空鼓、渗漏、漏电检测
卫生间/阳台/厨房地面清理
墙角圆弧处理
管根堵漏、封堵
墙面防水滚涂
地面防水涂刷
横竖交叉涂刷镜头
防水高度标注(淋浴区1.8m等)
放水蓄水镜头
24/48小时闭水记录
楼下检查有无渗漏
防水验收合格签字
卫生间陶粒回填
水泥砂浆找平
地面平整度检测
瓷砖泡水/背胶处理
水泥砂浆/瓷砖胶薄贴
瓦工贴墙砖、贴地砖
十字卡留缝、调平器使用
窗台石、门槛石安装
地漏安装、找坡
清缝、吸尘
美缝剂打胶、压缝
美缝余料清理
空鼓锤检测
阴阳角垂直检测
缝隙均匀度检查
排水坡度测试
轻钢龙骨/木龙骨搭建
吊顶封石膏板
拐角L型整板、V型槽处理
窗帘盒、双眼皮吊顶制作
衣柜/鞋柜/书柜现场打制
柜门制作、尺寸测量
石膏线条安装
背景墙木基层制作
钉眼防锈处理
接缝贴绷带、防开裂
阴阳角找直
墙面批第一遍腻子
第二遍腻子找平
全屋找平、顺平镜头
灯光下打磨墙面
砂纸打磨、除尘
墙面平整度检查
底漆滚涂
面漆第一遍、第二遍
分色、墙面分色贴纸
艺术漆/微水泥特殊工艺
墙面无流挂、无刷痕
无沙眼、无波浪纹
手感顺滑,光照均匀
墙面分色边界整齐度检查
断桥铝窗安装、打胶密封
室内门、门套安装
门锁、合页调试
橱柜柜体、台面安装
水槽、龙头安装
马桶、花洒、浴室柜安装
集成吊顶、浴霸、灯安装
地面防潮膜铺设
木地板/强化地板铺设
踢脚线、收边条安装
开关插座面板安装
主灯、筒灯、射灯安装
晾衣架、毛巾架等五金安装
定制衣柜、鞋柜组装
柜门调试、缝隙调整
拉手安装
沙发、床、餐桌搬运
家具拆包、摆放
床垫、床头柜安装
窗帘轨道安装
窗帘悬挂、褶皱调整
空调、冰箱、洗衣机安装
热水器、油烟机安装
挂画、绿植、饰品摆放
全屋风格统一镜头
全屋整体检查
水电、墙面、地面、门窗逐项验收
问题整改标注
验收表逐项打勾
深度保洁、擦玻璃
地面清洁、除胶除尘
钥匙交付
竣工合影
全屋成品全景展示
前后对比镜头(毛坯→完工)
施工安全(安全帽、警示牌、临时用电)
材料进场堆放、品牌展示
工人施工特写、手部细节
时间流逝/日夜对比
全景俯拍、局部特写、中景切换
业主满意表情、入住体验
网红开篇
【分镜结构】
开篇的分镜为:网红开头+人物出镜3秒+空镜补充
分点阐述全部用空镜
结尾人物出镜3秒+空镜补充
每个分镜时长不得少于3秒,且不得高于8秒
且每个分镜配音文案的文字数量对应每分钟230个字
"segment"(主播口播出镜)对应"人物出镜",且时长为3秒
"empty_shot"(空镜补充)对应"素材库标题"
【输出格式要求】
输出的内容必须包含以下两部分
一、分镜内容
- id:1
- type:"segment"(主播口播出镜)或 "empty_shot"(空镜补充)
- scene:"人物出镜"或"素材库标题"
- voiceover: 配音文案(必填,口语化15-25字/句)
- duration: 时长(如 "5s",根据字数生成,时长对应语速,每分钟230个字)
【示例】
[
{
"id": 1,
"type": "empty_shot",
"scene": "网红开篇",
"voiceover": "装修材料水太深,以次充好防不胜防!",
"duration": "3s"
},
{
"id": 2,
"type": "segment",
"scene": "人物出镜",
"voiceover": "这8种材料必须盯紧,分分钟被坑几万块!",
"duration": "3s"
},
{
"id": 3,
"type": "empty_shot",
"scene": "瓷砖泡水/背胶处理",
"voiceover": "第一,瓷砖贴牌砖多,正宗广东砖要看吸水率!",
"duration": "5s"
}
]
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
二、文案内容
把所有配音文案"voiceover"组合起来,生成纯文案内容
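The structural rules in this prompt (two allowed `type` values, `segment` shots fixed at 3 seconds with scene 人物出镜, every shot between 3 and 8 seconds, `voiceover` required) lend themselves to a check after generation. This is a minimal sketch, assuming the model returns the JSON array on its own. `validate_storyboard` is a hypothetical name and is not part of the commit.

```python
import json

ALLOWED_TYPES = {"segment", "empty_shot"}


def validate_storyboard(raw: str) -> list[str]:
    """Return a list of problems found; an empty list means the array passed."""
    problems: list[str] = []
    for shot in json.loads(raw):
        sid = shot.get("id")
        kind = shot.get("type")
        if kind not in ALLOWED_TYPES:
            problems.append(f"shot {sid}: unknown type {kind!r}")
        if not shot.get("voiceover"):
            problems.append(f"shot {sid}: voiceover is required")
        duration = str(shot.get("duration", ""))
        # The prompt asks for strings such as "5s".
        if not (duration.endswith("s") and duration[:-1].isdigit()):
            problems.append(f"shot {sid}: duration {duration!r} is not like '5s'")
            continue
        seconds = int(duration[:-1])
        if not 3 <= seconds <= 8:
            problems.append(f"shot {sid}: {seconds}s is outside the 3-8s range")
        # "segment" shots must show the presenter for exactly 3 seconds.
        if kind == "segment" and (shot.get("scene") != "人物出镜" or seconds != 3):
            problems.append(f"shot {sid}: segment shots must be 人物出镜 at 3s")
    return problems
```

Returning a list rather than raising on the first problem makes it easier to feed all violations back to the model in a single retry message, if a retry loop is used.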
+224
@@ -0,0 +1,224 @@
你是一位专业的【口播类短视频】脚本创作专家,专注于家装/装修领域的抖音/视频号口播内容创作。
【平台适配要求】
1. 竖屏拍摄(9:16比例),画面构图以人物为主体
2. 台词口语化、接地气,像跟朋友聊天,避免"综上所述""研究表明"等书面语
3. 语速稍快有节奏感,一分钟230个字左右,每句15-25字,一口气说完不换气,不拖沓
4. 避免专业术语堆砌,用业主听得懂的大白话
5. 符合新媒体用户观看习惯:3秒定生死,节奏紧凑
【文案要求】
请严格按照以下固定结构,生成一篇装修避坑指南(装修合同相关)文案,要求语言口语化、有警示性,贴合装修业主视角,结构严格不变,内容围绕 “装修合同避坑” 展开,每部分内容完整,总文案包含标点符号不得超过450字:
开篇总起:明确核心警示 —— 千万别直接签装修公司给的固定合同模板,不然必踩坑,以下8条必须白纸黑字写清楚才能保证权益,语气直接、有紧迫感。
分点阐述(8点,严格遵循此顺序和格式):
每点均按照 “核心要求 + 反面提醒(装修公司套路 / 后期隐患)+ 具体规范” 撰写,语言接地气,有劝诫感,避免生硬说教:
第1点:工期与保修期(提醒脱工烂尾隐患,明确保修责任及费用承担)
第2点:安全责任划分(提醒工人碰瓷、违规施工隐患,明确装修公司全责范围)
第3点:合同总价约定(提醒随意调价套路,明确含税情况及变更价款限制)
第4点:分期付款比例(提醒前期多交风险,明确各节点付款比例及验收要求)
第5点:工程验收标准(提醒验收漏洞,明确验收标准及业主通知义务)
第6点:材料质量约定(提醒材料以次充好套路,明确假一罚十及验收要求)
第7点:甲醛检测整改(提醒环保隐患,明确检测不合格的整改责任及费用)
第8点:违约责任划分(提醒违约无保障隐患,明确违约金及逾期赔付标准)
结尾引流:补充提示 —— 若准备新房装修,可获取整理好的装修合同模板,引导关注 / 领取(语气亲切,贴合业主需求)
提示:文案整体风格要通俗好记,有警示性,符合普通装修业主的认知,避免专业术语过多,每部分内容饱满,不遗漏核心避坑点,严格匹配上述结构,不新增、不删减板块。
【文案示例】
准备装修的家人们注意了!签合同别瞎签,装修公司的固定模板,直接签必踩坑!下面这8条,必须白纸黑字写清楚,才能保住你的权益!
第一,写清工期和保修期,防止脱工烂尾,保修费全由装修公司承担。不然装修公司拖工期、后期出问题不认账,你没处说理。
第二,安全责任分清楚,砸承重墙、工人出事,全由装修公司负责。别被工人碰瓷,最后自己承担不必要的损失。
第三,固定合同总价,含不含税金写明白,结算不随意调价,变更价款不超总价5%。避免装修公司后期乱加钱。
第四,按比例付款,验收合格再给钱,别一上来交太多。签合同付15%,验收合格再付下一笔,降低风险。
第五,工程验收按新标准,每个环节必须通知你,验收合格再下一步。不让装修公司跳过验收,埋下质量隐患。
第六,材料假一罚十,品牌型号对好,你确认后再施工。防止装修公司以次充好,偷换材料。
第七,甲醛检测不合格,装修公司整改并承担所有费用。避免入住后甲醛超标,维权无门。
第八,违约责任划清楚,违约金和逾期赔付金额写明白。保障自己权益,让装修公司不敢随意违约。
准备装修的,我整理了合同模板,评论区扣装修就能领!帮你装修少踩坑、省麻烦!
【分镜素材库标题】
业主与设计师/工长面对面沟通
户型图、平面方案讲解
合同条款翻阅、重点标注
双方签字、按手印、盖章
合同文件特写(封面、工期、付款节点、违约责任)
报价单整体展示
单价、工程量、合计金额特写
材料品牌、型号、规格标注镜头
增项、漏项对比标注
计算器核算、用笔圈画重点
设计师+工长+业主现场量房
激光测距仪、卷尺测量
墙面弹线、画标记
房屋原始结构记录(空鼓、裂缝、水管位置)
现场交底签字确认
旧墙面铲除、铲墙皮
拆非承重墙、电锤作业
拆旧地砖、旧墙砖
拆旧门窗、拆橱柜
建筑垃圾清运、装车
红砖/轻质砖砌筑
挂网、抹灰找平
门洞修整、过梁安装
墙体垂直度检测
入户门保护膜包裹
电梯口、走廊地面保护
窗户玻璃贴膜保护
下水口封堵防尘
地面地膜铺设
业主与水电工确认开关插座位置
墙面弹线定位
水电走向标记
全屋点位规划示意图
切割机墙面开槽
地面、顶面开槽
槽内清理、除尘
电线穿管、强弱电分离
水管铺设(冷热水管)
线管固定、管卡安装
底盒预埋、接线规范
电线接头烫锡/绝缘处理
打压测试(压力表特写)
通电测试、灯具试亮
水电走向拍照/录像存档
验收单签字
空鼓、渗漏、漏电检测
卫生间/阳台/厨房地面清理
墙角圆弧处理
管根堵漏、封堵
墙面防水滚涂
地面防水涂刷
横竖交叉涂刷镜头
防水高度标注(淋浴区1.8m等)
放水蓄水镜头
24/48小时闭水记录
楼下检查有无渗漏
防水验收合格签字
卫生间陶粒回填
水泥砂浆找平
地面平整度检测
瓷砖泡水/背胶处理
水泥砂浆/瓷砖胶薄贴
瓦工贴墙砖、贴地砖
十字卡留缝、调平器使用
窗台石、门槛石安装
地漏安装、找坡
清缝、吸尘
美缝剂打胶、压缝
美缝余料清理
空鼓锤检测
阴阳角垂直检测
缝隙均匀度检查
排水坡度测试
轻钢龙骨/木龙骨搭建
吊顶封石膏板
拐角L型整板、V型槽处理
窗帘盒、双眼皮吊顶制作
衣柜/鞋柜/书柜现场打制
柜门制作、尺寸测量
石膏线条安装
背景墙木基层制作
钉眼防锈处理
接缝贴绷带、防开裂
阴阳角找直
墙面批第一遍腻子
第二遍腻子找平
全屋找平、顺平镜头
灯光下打磨墙面
砂纸打磨、除尘
墙面平整度检查
底漆滚涂
面漆第一遍、第二遍
分色、墙面分色贴纸
艺术漆/微水泥特殊工艺
墙面无流挂、无刷痕
无沙眼、无波浪纹
手感顺滑、光照均匀
墙面分色边界整齐度检查
断桥铝窗安装、打胶密封
室内门、门套安装
门锁、合页调试
橱柜柜体、台面安装
水槽、龙头安装
马桶、花洒、浴室柜安装
集成吊顶、浴霸、灯安装
地面防潮膜铺设
木地板/强化地板铺设
踢脚线、收边条安装
开关插座面板安装
主灯、筒灯、射灯安装
晾衣架、毛巾架等五金安装
定制衣柜、鞋柜组装
柜门调试、缝隙调整
拉手安装
沙发、床、餐桌搬运
家具拆包、摆放
床垫、床头柜安装
窗帘轨道安装
窗帘悬挂、褶皱调整
空调、冰箱、洗衣机安装
热水器、油烟机安装
挂画、绿植、饰品摆放
全屋风格统一镜头
全屋整体检查
水电、墙面、地面、门窗逐项验收
问题整改标注
验收表逐项打勾
深度保洁、擦玻璃
地面清洁、除胶除尘
钥匙交付
竣工合影
全屋成品全景展示
前后对比镜头(毛坯→完工)
施工安全(安全帽、警示牌、临时用电)
材料进场堆放、品牌展示
工人施工特写、手部细节
时间流逝/日夜对比
全景俯拍、局部特写、中景切换
业主满意表情、入住体验
网红开篇
【分镜结构】
开篇的分镜为:网红开头+人物出镜3秒+空镜补充
分点阐述全部用空镜
结尾人物出镜3秒+空镜补充
每个分镜时长不得少于3秒,且不得高于8秒
且每个分镜配音文案的文字数量对应每分钟230个字
"segment"(主播口播出镜)对应"人物出镜",且时长为3秒
"empty_shot"(空镜补充)对应"素材库标题"
【输出格式要求】
输出的内容必须包含以下两部分
一、分镜内容
- id:1
- type:"segment"(主播口播出镜)或 "empty_shot"(空镜补充)
- scene:"人物出镜"或"素材库标题"
- voiceover: 配音文案(必填,口语化15-25字/句)
- duration: 时长(如 "5s",根据字数生成,时长对应语速,每分钟230个字)
【示例】
[
{
"id": 1,
"type": "empty_shot",
"scene": "网红开篇",
"voiceover": "装修签合同别踩坑!固定模板千万别直接签!",
"duration": "3s"
},
{
"id": 2,
"type": "segment",
"scene": "人物出镜",
"voiceover": "这8条内容,必须白纸黑字写进合同里!",
"duration": "3s"
},
{
"id": 3,
"type": "empty_shot",
"scene": "合同条款翻阅、重点标注",
"voiceover": "少一条都可能吃大亏,装修的家人一定要记牢!",
"duration": "5s"
},
{
"id": 4,
"type": "empty_shot",
"scene": "合同文件特写(封面、工期、付款节点、违约责任)",
"voiceover": "第一,工期和保修期写清楚,质量问题费用装修公司承担!",
"duration": "5s"
}
]
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
二、文案内容
把所有配音文案"voiceover"组合起来,生成纯文案内容
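The second output part, the plain copy, is the `voiceover` fields concatenated in shot order, and the prompt caps the total at 450 characters including punctuation. A small sketch of that step follows. `build_copy` is an assumed helper name, and sorting by `id` is an assumption about how shot order is defined.

```python
MAX_COPY_CHARS = 450  # cap stated in the prompt, punctuation included


def build_copy(shots: list[dict]) -> str:
    """Join all voiceover lines in id order and enforce the length cap."""
    ordered = sorted(shots, key=lambda shot: shot["id"])
    text = "".join(shot["voiceover"] for shot in ordered)
    if len(text) > MAX_COPY_CHARS:
        raise ValueError(f"copy is {len(text)} characters, limit is {MAX_COPY_CHARS}")
    return text
```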
+205
@@ -0,0 +1,205 @@
你是一位专业的【口播类短视频】脚本创作专家,专注于家装/装修领域的抖音/视频号口播内容创作。
【平台适配要求】
1. 竖屏拍摄(9:16比例),画面构图以人物为主体
2. 台词口语化、接地气,像跟朋友聊天
3. 语速稍快有节奏感,一分钟230个字左右
4. 避免专业术语堆砌,用业主听得懂的大白话
5. 符合新媒体用户观看习惯:3秒定生死,节奏紧凑
【文案要求】
请严格按照以下固定结构,生成一篇装修避坑指南(装修全流程相关)文案,要求语言口语化、有警示性,贴合装修业主视角,结构严格不变,内容围绕 "装修全流程避坑" 展开,每部分内容完整,总文案包含标点符号不得超过450字:
开篇总起:明确核心警示 —— 装修顺序搞错,费钱费时又返工,以下8个顺序必须搞清楚,少一步都可能吃大亏,语气直接紧迫。
分点阐述(8点):
按装修正确顺序,每点按照 "核心要点 + 常见错误做法/坑点 + 正确做法" 撰写,语言接地气有劝诫感:
第1点:拆改先行(提醒随意拆改隐患,明确承重墙不能动、物业审批流程)
第2点:水电为王(提醒水电翻工代价大,明确定位要业主确认、水电验收标准)
第3点:防水关键(提醒防水偷工减料隐患,明确闭水试验必须48小时楼下检查)
第4点:瓦工衔接(提醒瓦工与防水衔接隐患,明确先防水再贴砖、防水层保护)
第5点:木工收口(提醒木工尺寸误差隐患,明确现场测量、预留伸缩缝)
第6点:油工找平(提醒墙面开裂隐患,明确墙面处理遍数、干燥时间)
第7点:安装收尾(提醒安装顺序混乱隐患,明确安装与油漆衔接、成品保护)
第8点:验收总检(提醒遗漏验收隐患,明确逐项验收清单、拍照存档)
结尾引流:补充提示 —— 若想获取完整装修流程避坑清单,引导关注/领取(语气亲切,贴合业主需求)
【分镜素材库标题】
业主与设计师/工长面对面沟通
户型图、平面方案讲解
合同条款翻阅、重点标注
双方签字、按手印、盖章
合同文件特写(封面、工期、付款节点、违约责任)
报价单整体展示
单价、工程量、合计金额特写
材料品牌、型号、规格标注镜头
增项、漏项对比标注
计算器核算、用笔圈画重点
设计师+工长+业主现场量房
激光测距仪、卷尺测量
墙面弹线、画标记
房屋原始结构记录(空鼓、裂缝、水管位置)
现场交底签字确认
旧墙面铲除、铲墙皮
拆非承重墙、电锤作业
拆旧地砖、旧墙砖
拆旧门窗、拆橱柜
建筑垃圾清运、装车
红砖/轻质砖砌筑
挂网、抹灰找平
门洞修整、过梁安装
墙体垂直度检测
入户门保护膜包裹
电梯口、走廊地面保护
窗户玻璃贴膜保护
下水口封堵防尘
地面地膜铺设
业主与水电工确认开关插座位置
墙面弹线定位
水电走向标记
全屋点位规划示意图
切割机墙面开槽
地面、顶面开槽
槽内清理、除尘
电线穿管、强弱电分离
水管铺设(冷热水管)
线管固定、管卡安装
底盒预埋、接线规范
电线接头烫锡/绝缘处理
打压测试(压力表特写)
通电测试、灯具试亮
水电走向拍照/录像存档
验收单签字
空鼓、渗漏、漏电检测
卫生间/阳台/厨房地面清理
墙角圆弧处理
管根堵漏、封堵
墙面防水滚涂
地面防水涂刷
横竖交叉涂刷镜头
防水高度标注(淋浴区1.8m等)
放水蓄水镜头
24/48小时闭水记录
楼下检查有无渗漏
防水验收合格签字
卫生间陶粒回填
水泥砂浆找平
地面平整度检测
瓷砖泡水/背胶处理
水泥砂浆/瓷砖胶薄贴
瓦工贴墙砖、贴地砖
十字卡留缝、调平器使用
窗台石、门槛石安装
地漏安装、找坡
清缝、吸尘
美缝剂打胶、压缝
美缝余料清理
空鼓锤检测
阴阳角垂直检测
缝隙均匀度检查
排水坡度测试
轻钢龙骨/木龙骨搭建
吊顶封石膏板
拐角L型整板、V型槽处理
窗帘盒、双眼皮吊顶制作
衣柜/鞋柜/书柜现场打制
柜门制作、尺寸测量
石膏线条安装
背景墙木基层制作
钉眼防锈处理
接缝贴绷带、防开裂
阴阳角找直
墙面批第一遍腻子
第二遍腻子找平
全屋找平、顺平镜头
灯光下打磨墙面
砂纸打磨、除尘
墙面平整度检查
底漆滚涂
面漆第一遍、第二遍
分色、墙面分色贴纸
艺术漆/微水泥特殊工艺
墙面无流挂、无刷痕
无沙眼、无波浪纹
手感顺滑,光照均匀
墙面分色边界整齐度检查
断桥铝窗安装、打胶密封
室内门、门套安装
门锁、合页调试
橱柜柜体、台面安装
水槽、龙头安装
马桶、花洒、浴室柜安装
集成吊顶、浴霸、灯安装
地面防潮膜铺设
木地板/强化地板铺设
踢脚线、收边条安装
开关插座面板安装
主灯、筒灯、射灯安装
晾衣架、毛巾架等五金安装
定制衣柜、鞋柜组装
柜门调试、缝隙调整
拉手安装
沙发、床、餐桌搬运
家具拆包、摆放
床垫、床头柜安装
窗帘轨道安装
窗帘悬挂、褶皱调整
空调、冰箱、洗衣机安装
热水器、油烟机安装
挂画、绿植、饰品摆放
全屋风格统一镜头
全屋整体检查
水电、墙面、地面、门窗逐项验收
问题整改标注
验收表逐项打勾
深度保洁、擦玻璃
地面清洁、除胶除尘
钥匙交付
竣工合影
全屋成品全景展示
前后对比镜头(毛坯→完工)
施工安全(安全帽、警示牌、临时用电)
材料进场堆放、品牌展示
工人施工特写、手部细节
时间流逝/日夜对比
全景俯拍、局部特写、中景切换
业主满意表情、入住体验
网红开篇
【分镜结构】
开篇的分镜为:网红开头+人物出镜3秒+空镜补充
分点阐述全部用空镜
结尾人物出镜3秒+空镜补充
每个分镜时长不得少于3秒,且不得高于8秒
且每个分镜配音文案的文字数量对应每分钟230个字
"segment"(主播口播出镜)对应"人物出镜",且时长为3秒
"empty_shot"(空镜补充)对应"素材库标题"
【输出格式要求】
输出的内容必须包含以下两部分
一、分镜内容
- id:1
- type:"segment"(主播口播出镜)或 "empty_shot"(空镜补充)
- scene:"人物出镜"或"素材库标题"
- voiceover: 配音文案(必填,口语化15-25字/句)
- duration: 时长(如 "5s",根据字数生成,时长对应语速,每分钟230个字)
【示例】
[
{
"id": 1,
"type": "empty_shot",
"scene": "网红开篇",
"voiceover": "装修顺序搞错,费钱费时又返工!",
"duration": "3s"
},
{
"id": 2,
"type": "segment",
"scene": "人物出镜",
"voiceover": "这8个顺序必须搞清楚,少一步都吃大亏!",
"duration": "3s"
},
{
"id": 3,
"type": "empty_shot",
"scene": "设计师+工长+业主现场量房",
"voiceover": "第一步拆改先行,承重墙绝对不能动!",
"duration": "5s"
}
]
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
二、文案内容
把所有配音文案"voiceover"组合起来,生成纯文案内容
+205
@@ -0,0 +1,205 @@
你是一位专业的【口播类短视频】脚本创作专家,专注于家装/装修领域的抖音/视频号口播内容创作。
【平台适配要求】
1. 竖屏拍摄(9:16比例),画面构图以人物为主体
2. 台词口语化、接地气,像跟朋友聊天
3. 语速稍快有节奏感,一分钟230个字左右
4. 避免专业术语堆砌,用业主听得懂的大白话
5. 符合新媒体用户观看习惯:3秒定生死,节奏紧凑
【文案要求】
请严格按照以下固定结构,生成一篇装修避坑指南(全屋定制相关)文案,要求语言口语化、有警示性,贴合装修业主视角,结构严格不变,内容围绕 "全屋定制避坑" 展开,每部分内容完整,总文案包含标点符号不得超过450字:
开篇总起:明确核心警示 —— 全屋定制水最深,板材五金偷梁换柱,以下8个坑必须避开,不然花了大价钱反而被坑惨,语气直接紧迫。
分点阐述(8点):
每点均按照 "常见套路 + 业主踩坑后果 + 正确做法" 撰写,语言接地气有劝诫感:
第1点:板材猫腻(提醒颗粒板冒充多层板、品牌造假隐患,明确板材基材辨别方法)
第2点:环保陷阱(提醒甲醛超标、检测报告造假隐患,明确E0级/ENF级标准验证)
第3点:五金以次充好(提醒铰链、滑轨假洋品牌隐患,明确百隆、海蒂诗等辨别方法)
第4点:投影面积套路(提醒投影面积计算方式不透明隐患,明确展开面积vs投影面积猫腻)
第5点:增项加钱(提醒抽屉、灯带、玻璃门等额外收费隐患,明确标配清单必须确认)
第6点:尺寸出错(提醒测量失误导致返工隐患,明确复尺要求和验收标准)
第7点:安装翻车(提醒安装手艺差、缝隙大隐患,明确安装验收要点)
第8点:售后推诿(提醒出问题找不到人隐患,明确合同售后条款必须写清)
结尾引流:补充提示 —— 若想获取全屋定制合同审核清单,引导关注/领取(语气亲切,贴合业主需求)
【分镜素材库标题】
业主与设计师/工长面对面沟通
户型图、平面方案讲解
合同条款翻阅、重点标注
双方签字、按手印、盖章
合同文件特写(封面、工期、付款节点、违约责任)
报价单整体展示
单价、工程量、合计金额特写
材料品牌、型号、规格标注镜头
增项、漏项对比标注
计算器核算、用笔圈画重点
设计师+工长+业主现场量房
激光测距仪、卷尺测量
墙面弹线、画标记
房屋原始结构记录(空鼓、裂缝、水管位置)
现场交底签字确认
旧墙面铲除、铲墙皮
拆非承重墙、电锤作业
拆旧地砖、旧墙砖
拆旧门窗、拆橱柜
建筑垃圾清运、装车
红砖/轻质砖砌筑
挂网、抹灰找平
门洞修整、过梁安装
墙体垂直度检测
入户门保护膜包裹
电梯口、走廊地面保护
窗户玻璃贴膜保护
下水口封堵防尘
地面地膜铺设
业主与水电工确认开关插座位置
墙面弹线定位
水电走向标记
全屋点位规划示意图
切割机墙面开槽
地面、顶面开槽
槽内清理、除尘
电线穿管、强弱电分离
水管铺设(冷热水管)
线管固定、管卡安装
底盒预埋、接线规范
电线接头烫锡/绝缘处理
打压测试(压力表特写)
通电测试、灯具试亮
水电走向拍照/录像存档
验收单签字
空鼓、渗漏、漏电检测
卫生间/阳台/厨房地面清理
墙角圆弧处理
管根堵漏、封堵
墙面防水滚涂
地面防水涂刷
横竖交叉涂刷镜头
防水高度标注(淋浴区1.8m等)
放水蓄水镜头
24/48小时闭水记录
楼下检查有无渗漏
防水验收合格签字
卫生间陶粒回填
水泥砂浆找平
地面平整度检测
瓷砖泡水/背胶处理
水泥砂浆/瓷砖胶薄贴
瓦工贴墙砖、贴地砖
十字卡留缝、调平器使用
窗台石、门槛石安装
地漏安装、找坡
清缝、吸尘
美缝剂打胶、压缝
美缝余料清理
空鼓锤检测
阴阳角垂直检测
缝隙均匀度检查
排水坡度测试
轻钢龙骨/木龙骨搭建
吊顶封石膏板
拐角L型整板、V型槽处理
窗帘盒、双眼皮吊顶制作
衣柜/鞋柜/书柜现场打制
柜门制作、尺寸测量
石膏线条安装
背景墙木基层制作
钉眼防锈处理
接缝贴绷带、防开裂
阴阳角找直
墙面批第一遍腻子
第二遍腻子找平
全屋找平、顺平镜头
灯光下打磨墙面
砂纸打磨、除尘
墙面平整度检查
底漆滚涂
面漆第一遍、第二遍
分色、墙面分色贴纸
艺术漆/微水泥特殊工艺
墙面无流挂、无刷痕
无沙眼、无波浪纹
手感顺滑,光照均匀
墙面分色边界整齐度检查
断桥铝窗安装、打胶密封
室内门、门套安装
门锁、合页调试
橱柜柜体、台面安装
水槽、龙头安装
马桶、花洒、浴室柜安装
集成吊顶、浴霸、灯安装
地面防潮膜铺设
木地板/强化地板铺设
踢脚线、收边条安装
开关插座面板安装
主灯、筒灯、射灯安装
晾衣架、毛巾架等五金安装
定制衣柜、鞋柜组装
柜门调试、缝隙调整
拉手安装
沙发、床、餐桌搬运
家具拆包、摆放
床垫、床头柜安装
窗帘轨道安装
窗帘悬挂、褶皱调整
空调、冰箱、洗衣机安装
热水器、油烟机安装
挂画、绿植、饰品摆放
全屋风格统一镜头
全屋整体检查
水电、墙面、地面、门窗逐项验收
问题整改标注
验收表逐项打勾
深度保洁、擦玻璃
地面清洁、除胶除尘
钥匙交付
竣工合影
全屋成品全景展示
前后对比镜头(毛坯→完工)
施工安全(安全帽、警示牌、临时用电)
材料进场堆放、品牌展示
工人施工特写、手部细节
时间流逝/日夜对比
全景俯拍、局部特写、中景切换
业主满意表情、入住体验
网红开篇
【分镜结构】
开篇的分镜为:网红开头+人物出镜3秒+空镜补充
分点阐述全部用空镜
结尾人物出镜3秒+空镜补充
每个分镜时长不得少于3秒,且不得高于8秒
且每个分镜配音文案的文字数量对应每分钟230个字
"segment"(主播口播出镜)对应"人物出镜",且时长为3秒
"empty_shot"(空镜补充)对应"素材库标题"
【输出格式要求】
输出的内容必须包含以下两部分
一、分镜内容
- id:1
- type:"segment"(主播口播出镜)或 "empty_shot"(空镜补充)
- scene:"人物出镜"或"素材库标题"
- voiceover: 配音文案(必填,口语化15-25字/句)
- duration: 时长(如 "5s",根据字数生成,时长对应语速,每分钟230个字)
【示例】
[
{
"id": 1,
"type": "empty_shot",
"scene": "网红开篇",
"voiceover": "全屋定制水最深,板材五金偷梁换柱!",
"duration": "3s"
},
{
"id": 2,
"type": "segment",
"scene": "人物出镜",
"voiceover": "花了大价钱反而被坑惨,这8个坑必须避开!",
"duration": "3s"
},
{
"id": 3,
"type": "empty_shot",
"scene": "定制衣柜、鞋柜组装",
"voiceover": "第一,颗粒板冒充多层板,板材基材要验明!",
"duration": "5s"
}
]
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
二、文案内容
把所有配音文案"voiceover"组合起来,生成纯文案内容
+205
@@ -0,0 +1,205 @@
你是一位专业的【口播类短视频】脚本创作专家,专注于家装/装修领域的抖音/视频号口播内容创作。
【平台适配要求】
1. 竖屏拍摄(9:16比例),画面构图以人物为主体
2. 台词口语化、接地气,像跟朋友聊天
3. 语速稍快有节奏感,一分钟230个字左右
4. 避免专业术语堆砌,用业主听得懂的大白话
5. 符合新媒体用户观看习惯:3秒定生死,节奏紧凑
【文案要求】
请严格按照以下固定结构,生成一篇装修避坑指南(装修常见问题)文案,要求语言口语化、有警示性,贴合装修业主视角,结构严格不变,内容围绕 "装修常见问题" 展开,每部分内容完整,总文案包含标点符号不得超过450字:
开篇总起:明确核心警示 —— 装修处处是坑,十个业主九个踩,以下8个常见问题必须提前知道,不然等入住后悔都来不及,语气直接紧迫。
分点阐述(8点):
每点均按照 "问题现象 + 踩坑后果 + 正确做法/预防方法" 撰写,语言接地气有劝诫感:
第1点:找熟人装修(提醒找熟人反而被坑、投诉无门隐患,明确找熟人装修的利弊权衡)
第2点:只看价格(提醒低价吸引后期猛加钱隐患,明确合理价格区间判断方法)
第3点:不懂验收(提醒不懂验收被糊弄、后期出问题隐患,明确关键节点验收要点)
第4点:不懂工期(提醒不了解工期被拖延、无法追责隐患,明确各阶段工期参考)
第5点:不懂水电(提醒水电是隐蔽工程、出问题难改隐患,明确水电验收要点)
第6点:不懂环保(提醒甲醛超标、影响健康隐患,明确材料环保标准和检测方法)
第7点:不懂合同(提醒合同条款不清晰、口头承诺不认隐患,明确合同必看条款)
第8点:不懂售后(提醒售后无保障、出问题找不到人隐患,明确售后条款和质保期限)
结尾引流:补充提示 —— 若想获取装修避坑全攻略,引导关注/领取(语气亲切,贴合业主需求)
【分镜素材库标题】
业主与设计师/工长面对面沟通
户型图、平面方案讲解
合同条款翻阅、重点标注
双方签字、按手印、盖章
合同文件特写(封面、工期、付款节点、违约责任)
报价单整体展示
单价、工程量、合计金额特写
材料品牌、型号、规格标注镜头
增项、漏项对比标注
计算器核算、用笔圈画重点
设计师+工长+业主现场量房
激光测距仪、卷尺测量
墙面弹线、画标记
房屋原始结构记录(空鼓、裂缝、水管位置)
现场交底签字确认
旧墙面铲除、铲墙皮
拆非承重墙、电锤作业
拆旧地砖、旧墙砖
拆旧门窗、拆橱柜
建筑垃圾清运、装车
红砖/轻质砖砌筑
挂网、抹灰找平
门洞修整、过梁安装
墙体垂直度检测
入户门保护膜包裹
电梯口、走廊地面保护
窗户玻璃贴膜保护
下水口封堵防尘
地面地膜铺设
业主与水电工确认开关插座位置
墙面弹线定位
水电走向标记
全屋点位规划示意图
切割机墙面开槽
地面、顶面开槽
槽内清理、除尘
电线穿管、强弱电分离
水管铺设(冷热水管)
线管固定、管卡安装
底盒预埋、接线规范
电线接头烫锡/绝缘处理
打压测试(压力表特写)
通电测试、灯具试亮
水电走向拍照/录像存档
验收单签字
空鼓、渗漏、漏电检测
卫生间/阳台/厨房地面清理
墙角圆弧处理
管根堵漏、封堵
墙面防水滚涂
地面防水涂刷
横竖交叉涂刷镜头
防水高度标注(淋浴区1.8m等)
放水蓄水镜头
24/48小时闭水记录
楼下检查有无渗漏
防水验收合格签字
卫生间陶粒回填
水泥砂浆找平
地面平整度检测
瓷砖泡水/背胶处理
水泥砂浆/瓷砖胶薄贴
瓦工贴墙砖、贴地砖
十字卡留缝、调平器使用
窗台石、门槛石安装
地漏安装、找坡
清缝、吸尘
美缝剂打胶、压缝
美缝余料清理
空鼓锤检测
阴阳角垂直检测
缝隙均匀度检查
排水坡度测试
轻钢龙骨/木龙骨搭建
吊顶封石膏板
拐角L型整板、V型槽处理
窗帘盒、双眼皮吊顶制作
衣柜/鞋柜/书柜现场打制
柜门制作、尺寸测量
石膏线条安装
背景墙木基层制作
钉眼防锈处理
接缝贴绷带、防开裂
阴阳角找直
墙面批第一遍腻子
第二遍腻子找平
全屋找平、顺平镜头
灯光下打磨墙面
砂纸打磨、除尘
墙面平整度检查
底漆滚涂
面漆第一遍、第二遍
分色、墙面分色贴纸
艺术漆/微水泥特殊工艺
墙面无流挂、无刷痕
无沙眼、无波浪纹
手感顺滑,光照均匀
墙面分色边界整齐度检查
断桥铝窗安装、打胶密封
室内门、门套安装
门锁、合页调试
橱柜柜体、台面安装
水槽、龙头安装
马桶、花洒、浴室柜安装
集成吊顶、浴霸、灯安装
地面防潮膜铺设
木地板/强化地板铺设
踢脚线、收边条安装
开关插座面板安装
主灯、筒灯、射灯安装
晾衣架、毛巾架等五金安装
定制衣柜、鞋柜组装
柜门调试、缝隙调整
拉手安装
沙发、床、餐桌搬运
家具拆包、摆放
床垫、床头柜安装
窗帘轨道安装
窗帘悬挂、褶皱调整
空调、冰箱、洗衣机安装
热水器、油烟机安装
挂画、绿植、饰品摆放
全屋风格统一镜头
全屋整体检查
水电、墙面、地面、门窗逐项验收
问题整改标注
验收表逐项打勾
深度保洁、擦玻璃
地面清洁、除胶除尘
钥匙交付
竣工合影
全屋成品全景展示
前后对比镜头(毛坯→完工)
施工安全(安全帽、警示牌、临时用电)
材料进场堆放、品牌展示
工人施工特写、手部细节
时间流逝/日夜对比
全景俯拍、局部特写、中景切换
业主满意表情、入住体验
网红开篇
【分镜结构】
开篇的分镜为:网红开头+人物出镜3秒+空镜补充
分点阐述全部用空镜
结尾人物出镜3秒+空镜补充
每个分镜时长不得少于3秒,且不得高于8秒
且每个分镜配音文案的文字数量对应每分钟230个字
"segment"(主播口播出镜)对应"人物出镜",且时长为3秒
"empty_shot"(空镜补充)对应"素材库标题"
【输出格式要求】
输出的内容必须包含以下两部分
一、分镜内容
- id:1
- type:"segment"(主播口播出镜)或 "empty_shot"(空镜补充)
- scene:"人物出镜"或"素材库标题"
- voiceover: 配音文案(必填,口语化15-25字/句)
- duration: 时长(如 "5s",根据字数生成,时长对应语速,每分钟230个字)
【示例】
[
{
"id": 1,
"type": "empty_shot",
"scene": "网红开篇",
"voiceover": "装修处处是坑,十个业主九个踩!",
"duration": "3s"
},
{
"id": 2,
"type": "segment",
"scene": "人物出镜",
"voiceover": "这8个常见问题必须提前知道,入住后悔都来不及!",
"duration": "3s"
},
{
"id": 3,
"type": "empty_shot",
"scene": "业主与设计师/工长面对面沟通",
"voiceover": "第一,找熟人装修反而被坑,投诉无门最糟心!",
"duration": "5s"
}
]
注意:只输出纯 JSON,不要包含 markdown 代码块或其他说明文字。
二、文案内容
把所有配音文案"voiceover"组合起来,生成纯文案内容
@@ -0,0 +1,6 @@
请根据以下要求,创作一份口播类短视频分镜脚本:
【视频时长】
约 $duration 秒,正负不超过3秒。
$extra_params
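The `$duration` and `$extra_params` placeholders follow the syntax of Python's `string.Template`. Whether the project actually renders this file with `string.Template` is not shown in the diff, so the sketch below is an assumption. `render_user_prompt` is a hypothetical function and the template text is inlined here only to keep the example self-contained.

```python
from string import Template

# Inlined copy of the user prompt template shown in the diff.
USER_PROMPT = Template(
    "请根据以下要求,创作一份口播类短视频分镜脚本:\n"
    "【视频时长】\n"
    "约 $duration 秒,正负不超过3秒。\n"
    "$extra_params\n"
)


def render_user_prompt(duration: int, extra_params: str = "") -> str:
    """Fill both placeholders; substitute() raises KeyError if one is missing."""
    return USER_PROMPT.substitute(duration=duration, extra_params=extra_params)
```

Using `substitute` rather than `safe_substitute` means a forgotten placeholder fails loudly instead of leaving a literal `$name` in the text sent to the model.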
+3
@@ -0,0 +1,3 @@
请根据以下要求,创作一份口播类短视频分镜脚本:
【视频时长】
约 $duration 秒,正负不超过3秒。
+16 -21
@@ -33,34 +33,29 @@ async def get_current_user(
"""
     获取当前登录用户
-    从 Authorization Header 中提取 JWT Token 并验证
+    从 Authorization Header 中提取 JWT Token 并验证
+    开发模式下认证失败时自动兜底,返回数据库中的第一个用户,方便流程测试。
     """
-    if credentials is None:
-        raise HTTPException(
-            status_code=status.HTTP_401_UNAUTHORIZED,
-            detail="缺少认证信息",
-            headers={"WWW-Authenticate": "Bearer"},
-        )
+    # 开发模式:尝试正常认证,失败则兜底
+    user: User | None = None
-    token = credentials.credentials
-    payload = verify_token(token)
+    if credentials is not None:
+        token = credentials.credentials
+        payload = verify_token(token)
+        if payload and payload.get("sub"):
+            user_id = payload.get("sub")
+            result = await db.execute(select(User).where(User.id == user_id))
+            user = result.scalar_one_or_none()
-    if payload is None or payload.get("sub") is None:
-        raise HTTPException(
-            status_code=status.HTTP_401_UNAUTHORIZED,
-            detail="无效的认证信息",
-            headers={"WWW-Authenticate": "Bearer"},
-        )
-    user_id = payload.get("sub")
-    result = await db.execute(select(User).where(User.id == user_id))
-    user = result.scalar_one_or_none()
+    # 开发模式兜底:认证失败或用户不存在时,自动返回第一个用户
+    if user is None and settings.DEBUG:
+        result = await db.execute(select(User).limit(1))
+        user = result.scalar_one_or_none()
     if user is None:
         raise HTTPException(
             status_code=status.HTTP_401_UNAUTHORIZED,
-            detail="用户不存在",
+            detail="缺少认证信息",
             headers={"WWW-Authenticate": "Bearer"},
         )
-560
@@ -1,560 +0,0 @@
"""
Avatar 形象克隆模块
==================
串行流程:
1. 使用上传的视频创建 KlingAI 自定义音色 (custom-voices)
2. 轮询等待音色生成完成,获取 voice_id
3. 使用同一视频 + voice_id 创建 KlingAI 主体 (advanced-custom-elements)
4. 轮询等待主体生成完成,获取 provider_element_id
5. 返回统一的 AvatarItem
异步架构:
- POST /avatar/clone 只负责注册到 Async Engine(纯 Redis,无 DB),立即返回 task_id
- 真正的轮询由 Async Engine Scheduler 在后台执行
- 前端通过 SSE 或轮询 GET /avatar/tasks/{task_id} 查询进度
数据策略:
- 形象克隆数据只保存在前端本地,后端不持久化到数据库
- 任务运行时的中间状态全部存储在 Redis 中(TTL 24h)
错误提示策略:
- custom-voice 失败:提示"有声的人物视频"相关原因
- element 失败:提示视频内容/质量不符合主体创建要求
- 超时:标记为 timeout,支持重试
"""
import asyncio
import contextlib
import json
import logging
import uuid
from datetime import UTC, datetime
from fastapi import APIRouter, Depends, HTTPException, Query
from fastapi.responses import StreamingResponse
from pydantic import BaseModel, ConfigDict, Field
from app.ai.providers.klingai_provider import KlingAIProvider
from app.api.deps import get_current_user
from app.config import get_settings
from app.core.redis_client import get_redis_client
from app.scheduler.registry import JobRegistry
from app.schemas.common import ApiResponse, success_response
from app.schemas.enums import AvatarCloneStatus
logger = logging.getLogger(__name__)
router = APIRouter()
def _get_kling_provider() -> KlingAIProvider:
settings = get_settings()
return KlingAIProvider(
config={
"access_key": settings.KLINGAI_ACCESS_KEY or "",
"secret_key": settings.KLINGAI_SECRET_KEY or "",
}
)
async def _get_avatar_state(redis, job_id: str) -> dict | None:
"""从 Redis 读取 avatar 任务完整状态"""
data = await redis.hgetall(f"job:{job_id}")
if not data:
return None
# 解析 JSON 字段
for key in ("result", "params"):
if key in data and data[key]:
with contextlib.suppress(json.JSONDecodeError):
data[key] = json.loads(data[key])
return data
class CloneAvatarRequest(BaseModel):
"""创建形象克隆请求"""
name: str = Field(..., min_length=1, max_length=20, description="形象名称")
video_url: str = Field(description="人物视频 URL")
class CloneAvatarResponse(BaseModel):
"""创建形象克隆响应"""
task_id: str = Field(..., description="任务 ID(用于 SSE/轮询跟踪进度)")
status: str = Field("pending", description="初始状态")
class AvatarTaskStatusResponse(BaseModel):
"""任务状态查询响应"""
task_id: str
status: str = Field(..., description="当前状态")
fail_reason: str | None = Field(None, description="失败原因")
voice_id: str | None = Field(None, description="已生成的音色 ID")
human_id: int | None = Field(None, description="已生成的主体 ID")
trial_url: str | None = Field(None, description="试听 URL")
video_url: str = Field(..., description="原始视频 URL")
name: str = Field(..., description="形象名称")
created_at: datetime = Field(..., description="创建时间")
updated_at: datetime = Field(..., description="更新时间")
class AvatarItem(BaseModel):
"""形象库列表项"""
model_config = ConfigDict(from_attributes=True)
id: str = Field(..., description="形象唯一标识")
name: str = Field(..., description="展示名称")
voice_id: str = Field(..., description="Kling 自定义音色 ID")
human_id: int = Field(..., description="数字人主体 ID")
video_url: str = Field(description="原始人物视频 URL")
trial_url: str | None = Field(None, description="音色试听 URL")
record_time: str = Field(description="创建时间 ISO 字符串")
class UpdateAvatarNameRequest(BaseModel):
"""更新形象名称请求"""
name: str = Field(..., min_length=1, max_length=20, description="新形象名称")
# ============================================================
# API 路由
# ============================================================
@router.post("/avatar/clone", response_model=ApiResponse[CloneAvatarResponse])
async def clone_avatar(
data: CloneAvatarRequest,
current_user: dict = Depends(get_current_user),
):
"""
提交形象克隆任务
立即返回 task_id,前端通过 SSE 或轮询跟踪进度。
实际串行流程由 Async Engine Scheduler 异步执行。
任务状态纯 Redis 存储,不写入数据库。
"""
user_id = str(current_user.id)
name = data.name.strip()
video_url = data.video_url.strip()
# 生成 task_id
task_id = f"avt_{uuid.uuid4().hex[:16]}"
now = datetime.now(UTC)
# 写入 Redis,供 Async Engine 调度(同时存储 avatar 初始状态)
redis = get_redis_client()
registry = JobRegistry(redis)
await registry.create(task_id, "avatar_clone", user_id)
await registry.update(
task_id,
status="running",
progress=5,
message="开始形象克隆...",
completed=0,
total=1,
params={
"avatar_id": task_id,
"name": name,
"video_url": video_url,
"user_id": user_id,
},
# 存储 avatar 状态字段(供 API 查询)
avatar_status=AvatarCloneStatus.PENDING.value,
avatar_name=name,
avatar_video_url=video_url,
voice_id="",
provider_element_id="",
provider_voice_job_id="",
provider_element_job_id="",
trial_url="",
fail_reason="",
created_at=now.isoformat(),
updated_at=now.isoformat(),
)
await registry.add_running(task_id)
return success_response(data=CloneAvatarResponse(task_id=task_id, status="pending"))
@router.get("/avatar/tasks/{task_id}", response_model=ApiResponse[AvatarTaskStatusResponse])
async def get_avatar_task_status(
task_id: str,
current_user: dict = Depends(get_current_user),
):
"""查询形象克隆任务状态(从 Redis 读取)"""
redis = get_redis_client()
state = await _get_avatar_state(redis, task_id)
if not state:
raise HTTPException(status_code=404, detail="任务不存在")
# 权限检查
params = state.get("params", {}) if isinstance(state.get("params"), dict) else {}
if params.get("user_id") != str(current_user.id):
raise HTTPException(status_code=404, detail="任务不存在")
def _dt(key: str) -> datetime:
raw = state.get(key, "")
if raw:
try:
return datetime.fromisoformat(raw)
except ValueError:
pass
return datetime.now(UTC)
def _int(key: str) -> int | None:
raw = state.get(key, "")
if raw:
try:
return int(raw)
except ValueError:
pass
return None
return success_response(
data=AvatarTaskStatusResponse(
task_id=task_id,
status=state.get("avatar_status", state.get("status", "unknown")),
fail_reason=state.get("fail_reason") or None,
voice_id=state.get("voice_id") or None,
human_id=_int("provider_element_id"),
trial_url=state.get("trial_url") or None,
video_url=params.get("video_url", ""),
name=params.get("name", ""),
created_at=_dt("created_at"),
updated_at=_dt("updated_at"),
)
)
@router.get("/avatar/clone/stream")
async def sse_avatar_clone(
task_id: str = Query(..., alias="task_id", description="任务 ID"),
current_user: dict = Depends(get_current_user),
):
"""
SSE 流:实时推送形象克隆任务状态
前端连接后,每 3 秒推送一次状态,直到任务结束(succeed / failed / timeout)。
"""
user_id = str(current_user.id)
async def event_stream():
for _ in range(400): # 最多 20 分钟(400 * 3s)
redis = get_redis_client()
state = await _get_avatar_state(redis, task_id)
if not state:
payload = json.dumps(
{"status": "error", "fail_reason": "任务不存在或无权限"}, ensure_ascii=False
)
yield f"event: error\ndata: {payload}\n\n"
break
# 权限检查
params = state.get("params", {}) if isinstance(state.get("params"), dict) else {}
if params.get("user_id") != user_id:
payload = json.dumps(
{"status": "error", "fail_reason": "任务不存在或无权限"}, ensure_ascii=False
)
yield f"event: error\ndata: {payload}\n\n"
break
avatar_status = state.get("avatar_status", state.get("status", "unknown"))
payload = json.dumps(
{
"task_id": task_id,
"status": avatar_status,
"fail_reason": state.get("fail_reason") or None,
"voice_id": state.get("voice_id") or None,
"provider_element_id": state.get("provider_element_id") or None,
"trial_url": state.get("trial_url") or None,
"video_url": params.get("video_url", ""),
"name": params.get("name", ""),
"created_at": state.get("created_at", ""),
"updated_at": state.get("updated_at", ""),
},
ensure_ascii=False,
)
yield f"data: {payload}\n\n"
if avatar_status in (
AvatarCloneStatus.SUCCEED,
AvatarCloneStatus.VOICE_FAILED,
AvatarCloneStatus.ELEMENT_FAILED,
AvatarCloneStatus.TIMEOUT,
):
break
await asyncio.sleep(3)
else:
# 达到最大轮询次数,推送超时事件
payload = json.dumps(
{"status": "timeout", "fail_reason": "连接超时,请通过轮询接口继续跟踪"},
ensure_ascii=False,
)
yield f"event: timeout\ndata: {payload}\n\n"
return StreamingResponse(
event_stream(),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
},
)
@router.post("/avatar/tasks/{task_id}/retry", response_model=ApiResponse[dict])
async def retry_avatar_task(
task_id: str,
current_user: dict = Depends(get_current_user),
):
"""
重试失败或超时的形象克隆任务
仅允许对 voice_failed / element_failed / timeout 状态的任务重试。
重试时会重置状态为 pending 并重新注册到 Async Engine。
"""
redis = get_redis_client()
state = await _get_avatar_state(redis, task_id)
if not state:
raise HTTPException(status_code=404, detail="任务不存在")
params = state.get("params", {}) if isinstance(state.get("params"), dict) else {}
if params.get("user_id") != str(current_user.id):
raise HTTPException(status_code=404, detail="任务不存在")
avatar_status = state.get("avatar_status", state.get("status", ""))
if avatar_status not in (
AvatarCloneStatus.VOICE_FAILED.value,
AvatarCloneStatus.ELEMENT_FAILED.value,
AvatarCloneStatus.TIMEOUT.value,
):
raise HTTPException(status_code=400, detail=f"当前状态 {avatar_status} 不支持重试")
# 重置状态
registry = JobRegistry(redis)
now = datetime.now(UTC).isoformat()
await registry.update(
task_id,
status="running",
avatar_status=AvatarCloneStatus.PENDING,
fail_reason="",
voice_id="",
provider_element_id="",
provider_voice_job_id="",
provider_element_job_id="",
trial_url="",
updated_at=now,
)
await registry.add_running(task_id)
return success_response(data={"task_id": task_id, "status": "pending"})
@router.delete("/avatar/{avatar_id}", response_model=ApiResponse[dict])
async def delete_avatar(
avatar_id: str,
voice_id: str | None = None,
current_user: dict = Depends(get_current_user),
):
"""
删除形象:清理 Kling 资源 + 删除 Redis 任务记录
不操作数据库,形象数据由前端本地管理。
"""
redis = get_redis_client()
state = await _get_avatar_state(redis, avatar_id)
# 获取 Kling 资源 ID(优先用传入的,否则从 Redis 读)
actual_voice_id = voice_id
actual_element_id = None
if state:
params = state.get("params", {}) if isinstance(state.get("params"), dict) else {}
if params.get("user_id") == str(current_user.id):
actual_element_id = state.get("provider_element_id")
if not actual_voice_id:
actual_voice_id = state.get("voice_id")
# 异步清理 Kling 资源(不阻塞前端)
provider = _get_kling_provider()
if actual_element_id:
try:
await provider.delete_element(str(actual_element_id))
except Exception as e:
logger.warning(f"delete_element failed: {e}")
if actual_voice_id:
try:
await provider.delete_custom_voice(actual_voice_id)
except Exception as e:
logger.warning(f"delete_custom_voice failed: {e}")
# 删除 Redis 任务记录
registry = JobRegistry(redis)
await registry.delete(avatar_id)
return success_response(data={"success": True, "message": "形象已删除"})
@router.get("/avatar/library", response_model=ApiResponse[list[AvatarItem]])
async def get_avatar_library(
current_user: dict = Depends(get_current_user),
):
"""
获取当前用户的克隆形象库
形象数据只保存在前端本地,后端不持久化。
此接口始终返回空列表,由前端从 localStorage/文件系统读取真实数据。
"""
return success_response(data=[])
@router.patch("/avatar/{avatar_id}", response_model=ApiResponse[dict])
async def update_avatar_name(
avatar_id: str,
data: UpdateAvatarNameRequest,
current_user: dict = Depends(get_current_user),
):
"""
更新形象名称
形象数据由前端本地管理,后端仅返回成功。
"""
new_name = data.name.strip()
if not new_name:
raise HTTPException(status_code=400, detail="名称不能为空")
return success_response(data={"success": True, "name": new_name})
# =============================================================================
# 管理和监控接口(用于排查问题和手动恢复)
# =============================================================================
class AvatarHealthResponse(BaseModel):
"""形象克隆服务健康状态"""
total_processing: int = Field(..., description="处理中的任务总数")
pending: int = Field(..., description="待处理任务数")
voice_processing: int = Field(..., description="音色生成中任务数")
element_processing: int = Field(..., description="主体生成中任务数")
stuck_tasks: int = Field(..., description="卡住任务数(超过30分钟)")
recent_failures: int = Field(..., description="最近1小时失败数")
@router.get("/avatar/health", response_model=ApiResponse[AvatarHealthResponse])
async def get_avatar_health(
current_user: dict = Depends(get_current_user),
):
"""
获取形象克隆服务健康状态
基于 Redis 运行中任务统计,不查询数据库。
"""
redis = get_redis_client()
registry = JobRegistry(redis)
job_ids = await registry.get_running_job_ids()
total_processing = 0
pending = 0
voice_processing = 0
element_processing = 0
stuck_tasks = 0
recent_failures = 0
now = datetime.now(UTC)
stuck_threshold = now.timestamp() - 30 * 60 # 30 分钟前
recent_threshold = now.timestamp() - 60 * 60 # 1 小时前
for job_id in job_ids:
state = await _get_avatar_state(redis, job_id)
if not state:
continue
# 只统计当前用户的任务(非管理员)
params = state.get("params", {}) if isinstance(state.get("params"), dict) else {}
if params.get("user_id") != str(current_user.id):
continue
job_type = state.get("type", "")
if job_type != "avatar_clone":
continue
avatar_status = state.get("avatar_status", state.get("status", ""))
total_processing += 1
if avatar_status == AvatarCloneStatus.PENDING.value:
pending += 1
elif avatar_status == AvatarCloneStatus.VOICE_PROCESSING.value:
voice_processing += 1
elif avatar_status == AvatarCloneStatus.ELEMENT_PROCESSING.value:
element_processing += 1
# 检查是否卡住(updated_at 超过 30 分钟)
updated_at_raw = state.get("updated_at", "")
if updated_at_raw:
try:
updated_ts = datetime.fromisoformat(updated_at_raw).timestamp()
if updated_ts < stuck_threshold and avatar_status in (
AvatarCloneStatus.PENDING.value,
AvatarCloneStatus.VOICE_PROCESSING.value,
AvatarCloneStatus.ELEMENT_PROCESSING.value,
):
stuck_tasks += 1
except ValueError:
pass
# 检查最近失败
if avatar_status in (
AvatarCloneStatus.VOICE_FAILED.value,
AvatarCloneStatus.ELEMENT_FAILED.value,
AvatarCloneStatus.TIMEOUT.value,
):
updated_at_raw = state.get("updated_at", "")
if updated_at_raw:
try:
updated_ts = datetime.fromisoformat(updated_at_raw).timestamp()
if updated_ts >= recent_threshold:
recent_failures += 1
except ValueError:
pass
return success_response(
data=AvatarHealthResponse(
total_processing=total_processing,
pending=pending,
voice_processing=voice_processing,
element_processing=element_processing,
stuck_tasks=stuck_tasks,
recent_failures=recent_failures,
)
)
@router.post("/avatar/admin/trigger-recovery", response_model=ApiResponse[dict])
async def admin_trigger_recovery(
current_user: dict = Depends(get_current_user),
):
"""
手动触发卡住任务恢复(管理员接口)
Async Engine 会自动轮询,无需手动触发恢复。
"""
# 权限检查:基于特定手机号判断管理员
is_admin = current_user.mobile in ["13800138000", "admin"]
if not is_admin:
raise HTTPException(status_code=403, detail="需要管理员权限")
return success_response(
data={
"message": "Async Engine 会持续自动轮询,无需手动触发恢复",
"task_id": None,
}
)
+1 -1
@@ -226,7 +226,7 @@ async def upload_avatar(
- 分辨率: 高度 720px~2160px
- 内容: 写实风格人物正面特写,人脸清晰、无遮挡,视频中有清晰人声
-文件存储路径: meijiaka/avatars/{userId}/{date}/{uuid}.{ext}
+文件存储路径: meijiaka-zj/avatars/{userId}/{date}/{uuid}.{ext}
重复检测:
- 如果提供了 fileHash,会检查是否已有相同文件的任务在进行中
+6 -26
@@ -6,16 +6,11 @@ API v1 路由聚合
 from fastapi import APIRouter
 from app.api.v1 import (
-    ai_models,
     auth,
-    avatar,
     caption,
-    klingai,
-    qiniu,
-    script,
     system,
     tasks,
-    video,
+    voice,
 )
 api_router = APIRouter()
@@ -23,29 +18,14 @@ api_router = APIRouter()
# 认证模块
api_router.include_router(auth.router, prefix="/auth", tags=["Authentication"])
# 脚本模块
api_router.include_router(script.router, prefix="/script", tags=["Script"])
# AI 平台管理模块
api_router.include_router(ai_models.router, prefix="/ai", tags=["AI Models"])
# KlingAI 模块(视频/图像生成)
api_router.include_router(klingai.router, tags=["KlingAI"])
# 七牛云对象存储模块
api_router.include_router(qiniu.router, tags=["Qiniu Storage"])
# 视频生成模块
api_router.include_router(video.router, tags=["Video"])
# 形象克隆模块
api_router.include_router(avatar.router, tags=["Avatar"])
# 系统模块
api_router.include_router(system.router, prefix="/system", tags=["System"])
# 任务管理模块
api_router.include_router(tasks.router, prefix="/tasks", tags=["Tasks"])
# 字幕生成模块(火山引擎-豆包语音)
api_router.include_router(caption.router, tags=["Caption"])
# 统一任务管理模块
api_router.include_router(tasks.router, tags=["Tasks"])
# 语音合成模块(TTS + 声音克隆)
api_router.include_router(voice.router, tags=["Voice"])
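Review note: the 404 on `/tasks/script` mentioned in the commit message follows from how nested routers build URLs. A prefix given to the router and a prefix given at include time are concatenated, so declaring `/tasks` in both places shifts the route to `/tasks/tasks/...`. The sketch below only models that concatenation with plain strings; `mounted_path` is a hypothetical helper for illustration, not part of this repository or of FastAPI.

```python
def mounted_path(include_prefix: str, router_prefix: str, route_path: str) -> str:
    """Model how a nested route URL is assembled: the parts are concatenated."""
    return f"{include_prefix}{router_prefix}{route_path}"


# Prefix declared twice (on the router and again when including it).
duplicated = mounted_path("/tasks", "/tasks", "/script")

# Prefix declared once; it does not matter which side keeps it.
single = mounted_path("/tasks", "", "/script")

print(duplicated)  # /tasks/tasks/script
print(single)      # /tasks/script
```

Keeping the prefix in exactly one place, as this commit does, is the usual way to avoid the mismatch.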
+55 -2
@@ -29,7 +29,7 @@ from app.schemas.segment import Segment
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/tasks", tags=["Tasks"])
router = APIRouter(tags=["Tasks"])
# ========== 请求/响应模型 ==========
@@ -111,6 +111,21 @@ class CopyParams(BaseModel):
return v.strip()
class TTSParams(BaseModel):
"""TTS 语音合成参数"""
segments: list[dict] = Field(..., description="分镜列表,每项包含 id, text/voiceover")
voice_id: str = Field(default="zh_female_yizhi", description="音色 ID")
speed: float = Field(default=1.0, ge=0.8, le=2.0, description="语速 0.8-2.0")
@field_validator("segments")
@classmethod
def validate_segments(cls, v: list[dict]) -> list[dict]:
if not v:
raise ValueError("segments 不能为空列表")
return v
class TaskCreateRequest(BaseModel):
"""创建任务请求"""
@@ -154,7 +169,7 @@ def _generate_task_id() -> str:
@router.post("/{task_type}", response_model=TaskCreateResponse)
async def create_task(
task_type: Literal["video", "image", "script", "subtitle", "copy", "avatar_clone"],
task_type: Literal["video", "image", "script", "subtitle", "copy", "avatar_clone", "tts"],
request: TaskCreateRequest,
current_user: User = Depends(get_current_user),
) -> TaskCreateResponse:
@@ -343,6 +358,44 @@ async def create_task(
# 返回的任务 ID 用 avatar_id,保持前端兼容
task_id = avatar_id
elif task_type == "tts":
tts_params = dict(request.params)
raw_segments = tts_params.get("segments") or tts_params.get("texts", [])
if isinstance(raw_segments, list):
normalized = []
for i, seg in enumerate(raw_segments):
if isinstance(seg, dict):
normalized.append({
"id": seg.get("id", f"tts_{i}"),
"text": seg.get("text") or seg.get("voiceover", ""),
"index": i,
})
else:
normalized.append({
"id": f"tts_{i}",
"text": str(seg),
"index": i,
})
tts_params["segments"] = normalized
else:
raise ValueError("segments 必须为列表")
await registry.update(
task_id,
status="running",
message="准备语音合成...",
completed=0,
total=len(tts_params.get("segments", [])),
params={
"project_id": project_id,
"user_id": user_id,
"segments": json.dumps(tts_params.get("segments", []), ensure_ascii=False),
"voice_id": tts_params.get("voice_id", "zh_female_yizhi"),
"speed": tts_params.get("speed", 1.0),
},
)
await registry.add_running(task_id)
else:
raise HTTPException(status_code=400, detail=f"不支持的任务类型: {task_type}")
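Review note: the `tts` branch above accepts segments either as dicts (with `text` or `voiceover`) or as bare strings and rewrites them into a uniform `{id, text, index}` shape. Pulling that step into a pure function would make it testable without the task registry. The following is a minimal sketch under that assumption; `normalize_segments` is a hypothetical name and is not defined anywhere in this diff.

```python
from typing import Any


def normalize_segments(raw: Any) -> list[dict[str, Any]]:
    """Return segments as a list of {"id", "text", "index"} dicts."""
    if not isinstance(raw, list):
        raise ValueError("segments must be a list")

    normalized: list[dict[str, Any]] = []
    for i, seg in enumerate(raw):
        if isinstance(seg, dict):
            # Prefer "text"; fall back to "voiceover" as the route does.
            seg_id = seg.get("id", f"tts_{i}")
            text = seg.get("text") or seg.get("voiceover", "")
        else:
            # Bare values (for example plain strings) get a generated id.
            seg_id = f"tts_{i}"
            text = str(seg)
        normalized.append({"id": seg_id, "text": text, "index": i})
    return normalized


example = normalize_segments([{"id": "a", "voiceover": "hello"}, "plain text"])
```

With this split, the route would only need to translate the `ValueError` into an HTTP 4xx response before touching the registry.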
+1 -1
@@ -449,7 +449,7 @@ async def download_video(video_id: str):
支持三种查找位置:
1. data/video/{video_id}.mp4 - 传统存储
2. data/uploads/{video_id}.ext - 上传文件
3. ~/Documents/Meijiaka/projects/*/videos/{video_id}.mp4 - 项目生成的视频
3. ~/Documents/Meijiaka-zj/projects/*/videos/{video_id}.mp4 - 项目生成的视频
文件名格式: scene_{shot_id}.mp4
"""
try:
+315
@@ -0,0 +1,315 @@
"""
语音合成与克隆 API 路由
=======================
提供 TTS 语音合成、批量合成、声音克隆等功能。
基于 Kling AI TTS 和声音克隆 API。
"""
import logging
import tempfile
from pathlib import Path
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from app.schemas.common import ApiResponse, success_response
from app.services.tts_service import TTSService
from app.services.voice_clone_service import VoiceCloneService
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/voice", tags=["Voice"])
# ========== 请求/响应模型 ==========
class TTSSynthesizeRequest(BaseModel):
"""TTS 合成请求"""
text: str = Field(..., min_length=1, max_length=1000, description="待合成文本(≤1000字)")
voice_id: str | None = Field(None, description="音色 ID(默认:温柔女声)")
speed: float = Field(default=1.0, ge=0.8, le=2.0, description="语速 0.8-2.0")
voice_language: str = Field(default="zh", description="音色语种 (zh/en)")
class TTSBatchSegment(BaseModel):
"""批量合成段落"""
text: str = Field(..., min_length=1, description="段落文本")
index: int = Field(default=0, ge=0, description="段落序号")
filename: str | None = Field(None, description="输出文件名(不含扩展名)")
class TTSBatchRequest(BaseModel):
"""批量 TTS 合成请求"""
segments: list[TTSBatchSegment] = Field(..., min_length=1, description="段落列表")
voice_id: str | None = Field(None, description="音色 ID")
speed: float = Field(default=1.0, ge=0.8, le=2.0, description="语速")
class VoiceCloneSubmitRequest(BaseModel):
"""声音克隆提交请求"""
source_audio_url: str | None = Field(None, description="源音频 URL(5-30秒,mp3/wav,需公开可访问)")
source_video_url: str | None = Field(None, description="源视频 URL(可选)")
video_id: str | None = Field(None, description="历史作品ID(可选)")
voice_name: str | None = Field(None, description="自定义音色名称(≤20字符)")
class TTSBatchResponse(BaseModel):
"""批量合成结果"""
total: int
success_count: int
failed_count: int
results: list[dict]
class VoiceCloneTaskResponse(BaseModel):
"""克隆任务响应"""
task_id: str
status: str
voice_id: str | None = None
trial_url: str | None = None
error_message: str | None = None
class VoiceInfo(BaseModel):
"""音色信息"""
voice_id: str
name: str
description: str = ""
language: str = "zh"
recommended: bool = False
# ========== API 路由 ==========
@router.get("/voices", response_model=ApiResponse[list[VoiceInfo]])
async def list_voices():
"""
获取可用音色列表
返回预设的音色选项,用户可选择喜欢的音色进行 TTS 合成。
"""
voices = TTSService.get_preset_voices()
return success_response(
data=[VoiceInfo(**v) for v in voices],
message="获取音色列表成功",
)
@router.post("/synthesize", response_model=ApiResponse[dict])
async def synthesize_speech(request: TTSSynthesizeRequest):
"""
同步 TTS 合成
将文本转换为语音,返回音频 URL。
适用于短文本(≤1000字),长文本建议使用 /synthesize-batch。
"""
try:
service = TTSService()
audio_url = await service.synthesize_sync(
text=request.text,
voice_id=request.voice_id,
speed=request.speed,
voice_language=request.voice_language,
)
return success_response(
data={
"audio_url": audio_url,
"format": "mp3",
"text": request.text,
"voice_id": request.voice_id or "829826751244537879",
},
message="合成成功",
)
except ValueError as e:
logger.warning(f"[Voice] TTS 参数错误: {e}")
raise HTTPException(status_code=422, detail=str(e))
except Exception as e:
logger.error(f"[Voice] TTS 合成失败: {e}")
raise HTTPException(status_code=500, detail=f"合成失败: {str(e)}")
@router.post("/synthesize-batch", response_model=ApiResponse[TTSBatchResponse])
async def synthesize_batch(request: TTSBatchRequest):
"""
批量 TTS 合成
将多段文本批量转换为语音,保存到临时目录。
适用于长文本分段合成场景。
"""
try:
# 使用系统临时目录
output_dir = Path(tempfile.gettempdir()) / "meijiaka-zj_tts"
output_dir.mkdir(parents=True, exist_ok=True)
segments_data = [s.model_dump() for s in request.segments]
service = TTSService()
results = await service.batch_synthesize(
segments=segments_data,
output_dir=output_dir,
voice_id=request.voice_id,
speed=request.speed,
)
success_count = sum(1 for r in results if r["success"])
failed_count = len(results) - success_count
return success_response(
data=TTSBatchResponse(
total=len(results),
success_count=success_count,
failed_count=failed_count,
results=results,
),
message=f"批量合成完成:成功 {success_count} 段,失败 {failed_count} 段",
)
except Exception as e:
logger.error(f"[Voice] 批量 TTS 失败: {e}")
raise HTTPException(status_code=500, detail=f"批量合成失败: {str(e)}")
@router.post("/synthesize-file", response_model=ApiResponse[dict])
async def synthesize_to_file(request: TTSSynthesizeRequest, output_path: str):
"""
TTS 合成并保存到指定路径
将文本转换为语音并保存到指定文件路径。
"""
try:
service = TTSService()
saved_path = await service.synthesize_to_file(
text=request.text,
output_path=output_path,
voice_id=request.voice_id,
speed=request.speed,
voice_language=request.voice_language,
)
return success_response(
data={
"file_path": str(saved_path),
"text": request.text,
"voice_id": request.voice_id or "829826751244537879",
},
message="文件保存成功",
)
except ValueError as e:
logger.warning(f"[Voice] TTS 参数错误: {e}")
raise HTTPException(status_code=422, detail=str(e))
except Exception as e:
logger.error(f"[Voice] TTS 文件保存失败: {e}")
raise HTTPException(status_code=500, detail=f"保存失败: {str(e)}")
@router.post("/clone/submit", response_model=ApiResponse[VoiceCloneTaskResponse])
async def submit_clone_task(request: VoiceCloneSubmitRequest):
"""
提交声音克隆任务
提交音频/视频 URL 进行声音克隆,返回任务 ID 用于后续查询。
支持三种来源:source_audio_url、source_video_url、video_id。
"""
try:
service = VoiceCloneService()
task_id = await service.submit_clone_task(
source_audio_url=request.source_audio_url,
source_video_url=request.source_video_url,
video_id=request.video_id,
voice_name=request.voice_name,
)
return success_response(
data=VoiceCloneTaskResponse(
task_id=task_id,
status="pending",
),
message="克隆任务已提交",
)
except ValueError as e:
logger.warning(f"[Voice] 克隆参数错误: {e}")
raise HTTPException(status_code=422, detail=str(e))
except Exception as e:
logger.error(f"[Voice] 提交克隆任务失败: {e}")
raise HTTPException(status_code=500, detail=f"提交失败: {str(e)}")
@router.get("/clone/query/{task_id}", response_model=ApiResponse[VoiceCloneTaskResponse])
async def query_clone_task(task_id: str, blocking: bool = False):
"""
查询声音克隆任务状态
Args:
task_id: 任务 ID
blocking: 是否阻塞等待完成(默认 False)
"""
try:
service = VoiceCloneService()
result = await service.query_clone_task(task_id, blocking=blocking)
return success_response(
data=VoiceCloneTaskResponse(
task_id=result["task_id"],
status=result["status"],
voice_id=result.get("voice_id"),
trial_url=result.get("trial_url"),
error_message=result.get("error_message"),
)
)
except Exception as e:
logger.error(f"[Voice] 查询克隆任务失败: {e}")
raise HTTPException(status_code=500, detail=f"查询失败: {str(e)}")
@router.post("/clone/clone-and-wait", response_model=ApiResponse[VoiceCloneTaskResponse])
async def clone_and_wait(request: VoiceCloneSubmitRequest, poll_interval: float = 5.0):
"""
一站式克隆(提交并等待完成)
提交克隆任务并阻塞等待结果,直接返回最终状态。
适用于需要等待克隆完成的场景。
"""
try:
service = VoiceCloneService()
result = await service.wait_for_clone(
source_audio_url=request.source_audio_url,
source_video_url=request.source_video_url,
video_id=request.video_id,
voice_name=request.voice_name,
poll_interval=poll_interval,
)
return success_response(
data=VoiceCloneTaskResponse(
task_id=result["task_id"],
status=result["status"],
voice_id=result.get("voice_id"),
trial_url=result.get("trial_url"),
error_message=result.get("error_message"),
),
message=f"克隆任务完成,状态: {result['status']}",
)
except ValueError as e:
logger.warning(f"[Voice] 克隆参数错误: {e}")
raise HTTPException(status_code=422, detail=str(e))
except Exception as e:
logger.error(f"[Voice] 克隆失败: {e}")
raise HTTPException(status_code=500, detail=f"克隆失败: {str(e)}")
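Review note: `/clone/clone-and-wait` and the `blocking` flag on `/clone/query/{task_id}` both keep the HTTP request open while the clone task is polled. The sketch below shows one way such a loop can be bounded by a deadline so that a stuck task cannot hold the request indefinitely. It is an illustration only: `poll_until_done`, `fetch_status`, and the terminal status names are assumptions, not the actual `VoiceCloneService` implementation.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Assumed terminal status names; the real provider may use different ones.
TERMINAL_STATUSES = {"succeed", "failed"}


async def poll_until_done(
    fetch_status: Callable[[], Awaitable[dict]],
    poll_interval: float = 5.0,
    timeout: float = 120.0,
) -> dict:
    """Call fetch_status repeatedly until a terminal status or the deadline."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        result = await fetch_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        # Stop before sleeping if the next poll would land past the deadline.
        if loop.time() + poll_interval > deadline:
            raise TimeoutError(f"task not finished after {timeout} seconds")
        await asyncio.sleep(poll_interval)
```

A route using a helper like this could map `TimeoutError` to a 504 response and let the client fall back to the non-blocking query endpoint.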
+1 -1
@@ -91,7 +91,7 @@ class Settings(BaseSettings):
description="火山方舟默认模型(Model ID)",
)
# 火山引擎音视频字幕服务
# 火山引擎音视频字幕服务(字幕功能仍使用火山引擎)
VOLCENGINE_CAPTION_APPID: str | None = Field(default=None, description="火山字幕 AppID")
VOLCENGINE_CAPTION_TOKEN: str | None = Field(default=None, description="火山字幕 Token")
-2
@@ -10,10 +10,8 @@ CRUD 模块
user_obj = await user.get(db, id="xxx")
"""
from app.crud.model_usage import model_usage_log
from app.crud.user import user
__all__ = [
"user",
"model_usage_log",
]
-45
@@ -1,45 +0,0 @@
"""
模型使用日志 CRUD 操作
======================
仅保留使用日志功能,模型配置已迁移到 YAML 文件。
"""
from sqlalchemy import func, select
from sqlalchemy.ext.asyncio import AsyncSession
from app.crud.base import CRUDBase
from app.models.model_usage import ModelUsageLog
class ModelUsageLogCRUD(CRUDBase[ModelUsageLog]):
"""模型使用日志 CRUD"""
def __init__(self) -> None:
super().__init__(ModelUsageLog)
async def get_daily_cost(self, db: AsyncSession, *, date: str) -> float:
"""获取某日总成本"""
result = await db.execute(
select(func.sum(ModelUsageLog.cost_cny)).where(
func.date(ModelUsageLog.created_at) == date
)
)
return result.scalar() or 0.0
async def get_by_user(
self, db: AsyncSession, *, user_id: str, skip: int = 0, limit: int = 100
) -> list[ModelUsageLog]:
"""获取用户的使用日志"""
result = await db.execute(
select(ModelUsageLog)
.where(ModelUsageLog.user_id == user_id)
.order_by(ModelUsageLog.created_at.desc())
.offset(skip)
.limit(limit)
)
return list(result.scalars().all())
# 导出实例
model_usage_log = ModelUsageLogCRUD()
+4 -4
@@ -26,7 +26,7 @@ log_format = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
log_level = getattr(logging, settings.LOG_LEVEL)
# 创建日志目录(在用户文档目录下)
log_dir = Path.home() / "Documents" / "Meijiaka" / "logs"
log_dir = Path.home() / "Documents" / "Meijiaka-zj" / "logs"
log_dir.mkdir(parents=True, exist_ok=True)
# 日志文件名按日期
@@ -60,7 +60,7 @@ async def lifespan(app: FastAPI):
logger.info("Initializing database tables...")
try:
# 确保所有模型已注册到 metadata
from app.models import Avatar, ModelUsageLog, User # noqa: F401
from app.models import User # noqa: F401
await init_db()
logger.info("Database tables initialized")
@@ -93,7 +93,7 @@ def create_app() -> FastAPI:
app = FastAPI(
title=settings.APP_NAME,
version=settings.APP_VERSION,
description="美家卡智影 - AI 视频创作后端 API",
description="美家卡智剪 - AI 视频创作后端 API",
docs_url="/docs" if settings.DEBUG else None,
redoc_url="/redoc" if settings.DEBUG else None,
lifespan=lifespan,
@@ -153,7 +153,7 @@ def create_app() -> FastAPI:
"version": settings.APP_VERSION,
"docs": "/docs" if settings.DEBUG else None,
},
message="美家卡智影 API 服务",
message="美家卡智剪 API 服务",
)
return app
-4
@@ -6,15 +6,11 @@
注意:AIModel/AIPlatform 已迁移到 YAML 配置 (config/ai_models.yaml)
"""
from app.models.avatar import Avatar
from app.models.base import BaseModel
from app.models.model_usage import ModelUsageLog
from app.models.user import User
# 当前可用的模型
__all__ = [
"Avatar",
"BaseModel",
"ModelUsageLog",
"User",
]
-140
@@ -1,140 +0,0 @@
"""
Avatar 形象克隆模型
==================
存储用户克隆形象的信息,作为本地 localStorage 的云端备份。
"""
from datetime import UTC, datetime
from sqlalchemy import BigInteger, DateTime, ForeignKey, String, Text
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import Mapped, mapped_column
from app.db.session import Base
from app.schemas.enums import AvatarCloneStatus
class Avatar(Base):
"""
形象克隆记录表
用于备份用户在本地创建的克隆形象,支持换机恢复和客服排查。
"""
__tablename__ = "avatars"
# 主键:本地生成的唯一标识(与 Kling element_id 无关)
id: Mapped[str] = mapped_column(
String(64),
primary_key=True,
comment="本地形象唯一标识(如 avt_xxx)",
)
# 关联用户(外键,对应 users.id)
user_id: Mapped[str] = mapped_column(
UUID(as_uuid=False),
ForeignKey("users.id", ondelete="CASCADE"),
nullable=False,
index=True,
comment="关联用户 ID",
)
# 形象展示名称
name: Mapped[str] = mapped_column(
String(64),
nullable=False,
comment="形象展示名称",
)
# 供应商标识
provider: Mapped[str] = mapped_column(
String(32),
nullable=False,
default="kling",
comment="供应商标识: kling",
)
# Kling 自定义音色 ID(创建成功后回填)
voice_id: Mapped[str | None] = mapped_column(
String(64),
nullable=True,
comment="Kling 自定义音色 ID",
)
# 供应商主体 ID(创建成功后回填,用于调用 omni-video API)
provider_element_id: Mapped[int | None] = mapped_column(
BigInteger,
nullable=True,
comment="供应商主体 ID(数字类型,调用 API 时使用)",
)
# 供应商任务 ID(用于客服追溯)
provider_voice_job_id: Mapped[str | None] = mapped_column(
String(128),
nullable=True,
index=True,
comment="供应商自定义音色任务 ID",
)
provider_element_job_id: Mapped[str | None] = mapped_column(
String(128),
nullable=True,
index=True,
comment="供应商主体创建任务 ID",
)
# 资源地址
video_url: Mapped[str] = mapped_column(
Text,
nullable=False,
comment="原始人物视频 URL",
)
trial_url: Mapped[str | None] = mapped_column(
Text,
nullable=True,
comment="音色试听音频 URL",
)
# 状态机
status: Mapped[str] = mapped_column(
String(32),
nullable=False,
default=AvatarCloneStatus.PENDING.value,
comment="状态: pending/voice_processing/voice_failed/element_processing/element_failed/succeed/timeout",
)
# 失败原因(用户可读)
fail_reason: Mapped[str | None] = mapped_column(
Text,
nullable=True,
comment="失败原因(中文可读)",
)
# 软删除标记
deleted_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True),
nullable=True,
comment="软删除时间,NULL 表示未删除",
)
# 时间戳
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=lambda: datetime.now(UTC),
nullable=False,
comment="记录创建时间",
)
updated_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=lambda: datetime.now(UTC),
onupdate=lambda: datetime.now(UTC),
nullable=False,
comment="记录更新时间",
)
def to_dict(self) -> dict:
"""转换为字典(用于序列化)"""
return {column.name: getattr(self, column.name) for column in self.__table__.columns}
-62
@@ -1,62 +0,0 @@
"""
AI 模型使用日志模型
==================
存储模型调用的使用日志,用于成本统计和监控。
模型配置已迁移到 YAML 文件:config/ai_models.yaml
"""
from datetime import datetime
from sqlalchemy import Boolean, Column, DateTime, Float, Index, Integer, String, Text
from app.db.session import Base
class ModelUsageLog(Base):
"""模型使用日志 - 用于成本统计和监控"""
__tablename__ = "model_usage_logs"
id = Column(Integer, primary_key=True, autoincrement=True)
# 调用信息
model_id = Column(String(100), nullable=False)
platform_id = Column(String(50), nullable=False)
# 调用类型
task_type = Column(String(50), nullable=False) # script、polish、chat
# Token 用量
prompt_tokens = Column(Integer, default=0)
completion_tokens = Column(Integer, default=0)
total_tokens = Column(Integer, default=0)
# 成本(计算后的人民币)
cost_cny = Column(Float, default=0.0)
# 性能
response_time_ms = Column(Integer, nullable=True) # 响应时间
# 结果
success = Column(Boolean, default=True)
error_message = Column(Text, nullable=True)
# 用户/项目
user_id = Column(String(50), nullable=True)
project_id = Column(String(50), nullable=True)
# 时间
created_at = Column(DateTime, default=datetime.utcnow)
# 索引定义
__table_args__ = (
# 索引:按用户查询使用记录
Index("ix_model_usage_logs_user_id", "user_id"),
# 索引:按时间查询(用于统计)
Index("ix_model_usage_logs_created_at", "created_at"),
)
def __repr__(self):
return f"<UsageLog {self.model_id}: {self.total_tokens} tokens>"
@@ -100,7 +100,7 @@ class ImageHandler(AsyncHandler):
image_url = images[0].get("url")
image_dir = (
Path.home() / "Documents" / "Meijiaka" / "projects" / project_id / "images"
Path.home() / "Documents" / "Meijiaka-zj" / "projects" / project_id / "images"
)
image_dir.mkdir(parents=True, exist_ok=True)
ext = ".jpg" if ".jpg" in image_url else ".png"
@@ -9,11 +9,11 @@ Script 任务处理器
import logging
from typing import Any
from app.ai.prompts import TOPIC_PROMPT_MAP
from app.scheduler.handlers.base import AsyncHandler
from app.scheduler.models import StateChange
from app.scheduler.registry import JobRegistry
from app.scheduler.slot_manager import SlotManager
from app.services.anytocopy_service import get_anytocopy_service
from app.services.script_service import ScriptService
logger = logging.getLogger(__name__)
@@ -70,40 +70,53 @@ class ScriptHandler(AsyncHandler):
try:
await __import__("asyncio").sleep(2)
anytocopy = get_anytocopy_service()
extract_result = await anytocopy.extract_text_from_input(topic)
extracted_info = None
actual_topic = topic
is_video_url = extract_result.get("is_video_url", False)
if is_video_url:
await registry.update(
job.job_id,
progress=30,
message="提取视频素材中...",
)
video_info = extract_result.get("video_info")
if video_info:
extracted_info = {
"title": video_info.title,
"content": video_info.content,
"text_content": video_info.text_content,
"platform": video_info.platform,
"duration": video_info.duration,
"original_url": topic,
}
actual_topic = extract_result.get("extracted_text") or topic
await registry.update(
job.job_id,
progress=60,
message="生成脚本中...",
)
else:
# 判断是否为预设主题
is_preset_topic = topic in TOPIC_PROMPT_MAP
extracted_info = None
actual_topic = topic # 默认使用原始 topic
if is_preset_topic:
await registry.update(
job.job_id,
progress=40,
message="构思脚本中...",
)
else:
# 非预设主题:检测并提取视频链接中的文案
from app.services.anytocopy_service import get_anytocopy_service
anytocopy = get_anytocopy_service()
extract_result = await anytocopy.extract_text_from_input(topic)
if extract_result.get("is_video_url"):
await registry.update(
job.job_id,
progress=30,
message="提取视频素材中...",
)
video_info = extract_result.get("video_info")
if video_info:
extracted_info = {
"title": video_info.title,
"content": video_info.content,
"text_content": video_info.text_content,
"platform": video_info.platform,
"duration": video_info.duration,
"original_url": topic,
}
actual_topic = extract_result.get("extracted_text") or topic
await registry.update(
job.job_id,
progress=60,
message="生成脚本中...",
)
else:
await registry.update(
job.job_id,
progress=40,
message="构思脚本中...",
)
service = ScriptService()
shots = await service.generate_script(
@@ -0,0 +1,166 @@
"""
TTS 任务处理器
==============
管理 Kling AI TTS 的异步语音合成任务。
每个 Job 包含多个分镜的旁白文本,逐段合成后汇总结果。
"""
import json
import logging
import tempfile
from pathlib import Path
from typing import Any
from app.scheduler.handlers.base import AsyncHandler
from app.scheduler.models import StateChange
from app.scheduler.registry import JobRegistry
from app.scheduler.slot_manager import SlotManager
from app.services.tts_service import TTSService
logger = logging.getLogger(__name__)
SLOT_KEY = "tts:slots"
MAX_SLOTS = 10
class TTSHandler(AsyncHandler):
"""TTS 异步任务处理器"""
name = "tts"
slot_key = SLOT_KEY
max_slots = MAX_SLOTS
async def tick(
self, jobs: list[Any], registry: JobRegistry, slots: SlotManager
) -> list[StateChange]:
"""
处理 TTS 任务:
1. 逐段合成语音(每段占用一个槽位)
2. 实时更新进度 completed / result
3. 完成后标记任务为 completed
"""
changes: list[StateChange] = []
for job in jobs:
params = job.params or {}
segments: list[dict] = json.loads(params.get("segments", "[]"))
voice_id = params.get("voice_id", "zh_female_yizhi")
speed = float(params.get("speed", 1.0))
output_dir = params.get("output_dir")
if not segments:
changes.append(StateChange(job_id=job.job_id, field_path="status", value="failed"))
changes.append(StateChange(job_id=job.job_id, field_path="message", value="没有待合成的分镜"))
changes.append(StateChange(job_id=job.job_id, field_path="error", value="空分镜列表"))
continue
total = len(segments)
completed = job.completed or 0
results = job.result.get("audio_results", []) if job.result else []
# 找到下一个未处理的段落
next_index = completed
if next_index >= total:
# 所有段落已处理完毕
if job.status != "completed":
changes.append(StateChange(job_id=job.job_id, field_path="status", value="completed"))
changes.append(StateChange(job_id=job.job_id, field_path="progress", value=100))
changes.append(StateChange(job_id=job.job_id, field_path="message", value="语音合成完成"))
# 汇总成功/失败数
success_count = sum(1 for r in results if r.get("success"))
failed_count = total - success_count
changes.append(StateChange(job_id=job.job_id, field_path="result", value={
"audio_results": results,
"total": total,
"success_count": success_count,
"failed_count": failed_count,
}))
await slots.release(SLOT_KEY, job.job_id)
continue
# 获取当前段落
segment = segments[next_index]
segment_id = segment.get("id", f"seg_{next_index}")
text = segment.get("text") or segment.get("voiceover", "")
if not text or not text.strip():
# 空文本,跳过
logger.warning(f"[TTS] 段落 {next_index} 文本为空,跳过")
results.append({
"segment_id": segment_id,
"index": next_index,
"text": "",
"audio_path": None,
"success": False,
"error": "文本为空",
})
changes.append(StateChange(job_id=job.job_id, field_path="completed", value=next_index + 1))
changes.append(StateChange(job_id=job.job_id, field_path="progress", value=int((next_index + 1) / total * 100)))
changes.append(StateChange(job_id=job.job_id, field_path="message", value=f"合成中... {next_index + 1}/{total}"))
changes.append(StateChange(job_id=job.job_id, field_path="result", value={"audio_results": results}))
continue
# 尝试获取槽位
acquired = await slots.acquire(SLOT_KEY, job.job_id, MAX_SLOTS)
if not acquired:
# 槽位已满,等待下一次 tick
changes.append(StateChange(job_id=job.job_id, field_path="message", value=f"等待槽位... {next_index}/{total}"))
continue
# 执行合成
try:
service = TTSService()
if not output_dir:
output_dir = str(Path(tempfile.gettempdir()) / "meijiaka-zj_tts")
params["output_dir"] = output_dir
changes.append(StateChange(job_id=job.job_id, field_path="params", value=params))
output_path = Path(output_dir) / f"tts_{job.job_id}_{next_index:04d}.mp3"
await service.synthesize_to_file(
text=text,
output_path=output_path,
voice_id=voice_id,
speed=speed,
)
results.append({
"segment_id": segment_id,
"index": next_index,
"text": text,
"audio_path": str(output_path),
"success": True,
"error": None,
})
completed = next_index + 1
changes.append(StateChange(job_id=job.job_id, field_path="completed", value=completed))
changes.append(StateChange(job_id=job.job_id, field_path="progress", value=int(completed / total * 100)))
changes.append(StateChange(job_id=job.job_id, field_path="message", value=f"合成中... {completed}/{total}"))
changes.append(StateChange(job_id=job.job_id, field_path="result", value={"audio_results": results}))
# 成功时释放槽位
await slots.release(SLOT_KEY, job.job_id)
except Exception as e:
logger.error(f"[TTS] 段落 {next_index} 合成失败: {e}")
results.append({
"segment_id": segment_id,
"index": next_index,
"text": text,
"audio_path": None,
"success": False,
"error": str(e),
})
completed = next_index + 1
changes.append(StateChange(job_id=job.job_id, field_path="completed", value=completed))
changes.append(StateChange(job_id=job.job_id, field_path="progress", value=int(completed / total * 100)))
changes.append(StateChange(job_id=job.job_id, field_path="message", value=f"合成失败: {str(e)[:50]}"))
changes.append(StateChange(job_id=job.job_id, field_path="result", value={"audio_results": results}))
# 失败时也释放槽位(允许后续重试或其他任务使用)
await slots.release(SLOT_KEY, job.job_id)
return changes
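Review note: `TTSHandler.tick` processes at most one segment per job per tick and uses the stored `completed` counter as the resume point, so a job that is interrupted can continue from where it stopped. The sketch below isolates that stepping logic from Redis, slots, and the Kling client. `advance_one_segment` and the injected `synthesize` callable are hypothetical stand-ins for illustration, not code from this repository.

```python
from collections.abc import Callable
from typing import Any


def advance_one_segment(
    segments: list[dict[str, Any]],
    completed: int,
    results: list[dict[str, Any]],
    synthesize: Callable[[str], str],
) -> tuple[int, int, str]:
    """Process the next pending segment; return (completed, progress, status)."""
    total = len(segments)
    if completed >= total:
        # Nothing left to do: the job is finished.
        return completed, 100, "completed"

    text = (segments[completed].get("text") or "").strip()
    entry: dict[str, Any] = {"index": completed, "audio_path": None, "error": None}
    try:
        if not text:
            raise ValueError("empty text")
        entry["audio_path"] = synthesize(text)
        entry["success"] = True
    except Exception as exc:  # a failed segment is recorded, not fatal
        entry["success"] = False
        entry["error"] = str(exc)
    results.append(entry)

    completed += 1
    return completed, int(completed / total * 100), "running"
```

As in the handler, the job is only reported as completed on the call after the last segment has been processed, and a failed or empty segment still advances the counter so the remaining segments are not blocked.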
@@ -57,7 +57,7 @@ class VideoHandler(AsyncHandler):
return self._provider
def _get_project_video_dir(self, project_id: str) -> Path:
video_dir = Path.home() / "Documents" / "Meijiaka" / "projects" / project_id / "videos"
video_dir = Path.home() / "Documents" / "Meijiaka-zj" / "projects" / project_id / "videos"
video_dir.mkdir(parents=True, exist_ok=True)
return video_dir
+2
@@ -15,6 +15,7 @@ from app.scheduler.handlers.copy_handler import CopyHandler
from app.scheduler.handlers.image_handler import ImageHandler
from app.scheduler.handlers.script_handler import ScriptHandler
from app.scheduler.handlers.subtitle_handler import SubtitleHandler
from app.scheduler.handlers.tts_handler import TTSHandler
from app.scheduler.handlers.video_handler import VideoHandler
logger = logging.getLogger("scheduler")
@@ -38,6 +39,7 @@ async def main() -> None:
engine.register(SubtitleHandler())
engine.register(CopyHandler())
engine.register(ScriptHandler())
engine.register(TTSHandler())
await engine.run_forever(interval=10.0, min_interval=2.0)
@@ -7,7 +7,7 @@ Kling 视频生成服务
- 空镜视频生成(文生图 + 图生视频)- 空镜场景
- 并发控制、轮询查询、视频下载
存储路径:~/Documents/Meijiaka/projects/{project_id}/videos/
存储路径:~/Documents/Meijiaka-zj/projects/{project_id}/videos/
命名规则:scene_{segment_id}.mp4
空镜生成新流程:
@@ -62,7 +62,7 @@ class KlingVideoService:
TIMEOUT = 600
# 视频存储根目录
BASE_STORAGE_DIR = Path.home() / "Documents" / "Meijiaka" / "projects"
BASE_STORAGE_DIR = Path.home() / "Documents" / "Meijiaka-zj" / "projects"
def __init__(self):
self._provider: KlingAIProvider | None = None
+1 -1
@@ -95,7 +95,7 @@ class QiniuService:
return self.video_bucket, self.video_domain
# 项目前缀
PROJECT_PREFIX = "meijiaka"
PROJECT_PREFIX = "meijiaka-zj"
def generate_key(self, file_type: str, original_filename: str, user_id: str = None) -> str:
"""
+42 -24
@@ -12,7 +12,7 @@ from collections.abc import AsyncIterator
from pathlib import Path
from app.ai.model_router import get_model_router
from app.ai.prompts import load_script_system, load_script_user
from app.ai.prompts import load_script_system, load_script_user_prompt, load_topic_prompt, TOPIC_PROMPT_MAP
from app.schemas.script import ScriptGenerationEvent, ScriptShot
from app.services.ai_response_utils import (
safe_parse_ai_json_response,
@@ -156,7 +156,7 @@ class ScriptService:
同步生成脚本
Args:
topic: 创作主题(支持视频链接,自动提取文案)
topic: 创作主题(预设主题名或自定义输入/视频链接)
duration: 视频时长(秒)
script_type: 脚本类型
model: 指定模型
@@ -164,30 +164,39 @@ class ScriptService:
Returns:
分镜列表
"""
# 1. 检测并提取视频链接中的文案
anytocopy = get_anytocopy_service()
extract_result = await anytocopy.extract_text_from_input(topic)
# 1. 判断是否为预设主题
is_preset_topic = topic in TOPIC_PROMPT_MAP
if extract_result["error"]:
logger.warning(f"视频文案提取失败: {extract_result['error']}")
# 提取失败但不中断,使用原始输入
# 2. 根据类型决定处理方式
actual_topic = topic
if not is_preset_topic:
# 非预设主题:检测并提取视频链接中的文案
anytocopy = get_anytocopy_service()
extract_result = await anytocopy.extract_text_from_input(topic)
if extract_result["is_video_url"]:
logger.info(f"检测到视频链接,提取文案长度: {len(extract_result['extracted_text'])}")
# 使用提取的文案作为创作主题
topic = extract_result["extracted_text"] or topic
if extract_result["error"]:
logger.warning(f"视频文案提取失败: {extract_result['error']}")
# 提取失败但不中断,使用原始输入
# 2. 获取 model_router
if extract_result["is_video_url"]:
logger.info(f"检测到视频链接,提取文案长度: {len(extract_result['extracted_text'])}")
# 使用提取的文案作为创作主题
actual_topic = extract_result["extracted_text"] or topic
# 3. 获取 model_router
model_router = await get_model_router()
# 加载 Prompt(使用新的 loader)
system_prompt = load_script_system()
user_prompt = load_script_user(
# 4. 加载 Prompt
# 系统提示词:预设主题用专用提示词,否则用通用提示词
system_prompt = load_topic_prompt(topic) if is_preset_topic else load_script_system()
# 用户提示词
user_prompt = load_script_user_prompt(
topic=actual_topic,
duration=duration,
script_type=script_type,
)
logger.info(f"同步生成脚本: topic={topic}, is_preset={is_preset_topic}, duration={duration}")
logger.info(f"同步生成脚本: topic={topic[:20]}, duration={duration}")
# 调用 AI 生成
@@ -240,7 +249,7 @@ class ScriptService:
"""
流式生成脚本(SSE)- 优化版
支持视频链接自动提取文案。
支持预设主题和视频链接自动提取文案。
进度设计:
- 0-5%: start(初始化)
@@ -253,13 +262,19 @@ class ScriptService:
model_router = await get_model_router()
start_time = time.time()
# 1. 检测并提取视频链接中的文案
# 1. 判断是否为预设主题
is_preset_topic = topic in TOPIC_PROMPT_MAP
# 2. 非预设主题时,检测并提取视频链接中的文案
original_topic = topic
anytocopy = get_anytocopy_service()
extracted_info = None # 保存提取的视频信息
actual_topic = topic
# 检查是否为视频链接
if AnyToCopyService.is_video_url(topic) or AnyToCopyService.extract_url_from_text(topic):
# 检查是否为视频链接(非预设主题才检测)
if not is_preset_topic and (
AnyToCopyService.is_video_url(topic) or AnyToCopyService.extract_url_from_text(topic)
):
yield ScriptGenerationEvent(
type="analyzing",
progress=5,
@@ -300,13 +315,16 @@ class ScriptService:
try:
# 加载 Prompt
system_prompt = load_script_system()
user_prompt = load_script_user(
# 系统提示词:预设主题用专用提示词,否则用通用提示词
system_prompt = load_topic_prompt(topic) if is_preset_topic else load_script_system()
# 用户提示词
user_prompt = load_script_user_prompt(
topic=topic,
duration=duration,
script_type=script_type,
)
logger.info(f"流式生成脚本: topic={topic}, is_preset={is_preset_topic}, duration={duration}")
# 1. 开始阶段(0-5%)
yield ScriptGenerationEvent(
type="start",
+259
@@ -0,0 +1,259 @@
"""
TTS 服务层
==========
封装 Kling AI TTS API,提供语音合成能力。
API 文档:https://klingai.com/document-api
"""
import asyncio
import logging
from pathlib import Path
from app.ai.providers.klingai_provider import KlingAIProvider
from app.config import get_settings
logger = logging.getLogger(__name__)
# Kling TTS API 配置
TTS_TASK_TIMEOUT = 120 # TTS 任务最大等待时间(秒)
TTS_POLL_INTERVAL = 2.0 # 轮询间隔(秒)
def _get_kling_provider() -> KlingAIProvider:
"""获取 KlingAI Provider 实例"""
settings = get_settings()
config = {
"access_key": settings.KLINGAI_ACCESS_KEY or "",
"secret_key": settings.KLINGAI_SECRET_KEY or "",
}
return KlingAIProvider(config)
class TTSService:
"""Kling AI TTS 服务客户端"""
# Kling 官方预设音色(已知音色)
PRESET_VOICES = [
{
"voice_id": "829824295735410756",
"name": "钓系女友",
"language": "zh",
"description": "甜美撒娇",
},
{
"voice_id": "829826751244537879",
"name": "温柔女声",
"language": "zh",
"description": "温柔细腻",
},
{
"voice_id": "829826792415842333",
"name": "播报男声",
"language": "zh",
"description": "沉稳播报",
},
{
"voice_id": "829826834144964676",
"name": "盐系少年",
"language": "zh",
"description": "清新少年",
},
{
"voice_id": "829826884271091753",
"name": "撒娇女友",
"language": "zh",
"description": "可爱撒娇",
},
]
def __init__(self) -> None:
self.provider = _get_kling_provider()
self.default_voice_id = "829826751244537879" # 温柔女声
async def synthesize_sync(
self,
text: str,
voice_id: str | None = None,
speed: float = 1.0,
voice_language: str = "zh",
) -> str:
"""
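Synthesize speech and wait for the result (submit the task, then poll until it finishes); returns the audio URL.
Args:
text: text to synthesize (at most 1000 characters)
voice_id: voice ID (defaults to 温柔女声)
speed: speaking speed (0.8-2.0); the range is not validated here
voice_language: language (zh/en)
Returns:
Audio URL
Raises:
ValueError: invalid parameters, failed submission, or failed task
TimeoutError: the task did not finish within TTS_TASK_TIMEOUT seconds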
"""
if not text or not text.strip():
raise ValueError("text 不能为空")
if len(text) > 1000:
raise ValueError("text 不能超过 1000 字符")
voice = voice_id or self.default_voice_id
# 提交 TTS 任务
result = await self.provider.generate_tts(
text=text,
voice_id=voice,
voice_language=voice_language,
voice_speed=speed,
)
task_id = result.get("task_id")
if not task_id:
raise ValueError("TTS 任务提交失败: 未返回 task_id")
logger.info(f"[TTS] 任务已提交: task_id={task_id}")
# 等待任务完成
audio_url = await self._wait_for_task(task_id)
return audio_url
async def _wait_for_task(self, task_id: str) -> str:
"""等待 TTS 任务完成并返回音频 URL"""
elapsed = 0.0
while elapsed < TTS_TASK_TIMEOUT:
await asyncio.sleep(TTS_POLL_INTERVAL)
elapsed += TTS_POLL_INTERVAL
result = await self.provider.get_tts_task(task_id)
status = result.get("status") or result.get("task_status", "")
logger.debug(f"[TTS] task_id={task_id}, status={status}, elapsed={elapsed}s")
if status == "succeed":
# 任务成功,返回音频 URL
task_result = result.get("task_result", {})
audio_url = task_result.get("audio_url") if isinstance(task_result, dict) else None
if audio_url:
return audio_url
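# Some response formats put the URL at the top level or under "data"
audio_url = result.get("audio_url") or (result.get("data") or {}).get("audio_url")
if not audio_url:
raise ValueError("TTS 任务成功但未返回 audio_url")
return audio_url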
if status in ("failed", "error"):
raise ValueError(f"TTS 任务失败: {result.get('message', '未知错误')}")
raise TimeoutError(f"TTS 任务等待超时({TTS_TASK_TIMEOUT}秒)")
async def synthesize_to_file(
self,
text: str,
output_path: str | Path,
voice_id: str | None = None,
speed: float = 1.0,
voice_language: str = "zh",
) -> Path:
"""
Synthesize speech and save it to a file.
Args:
text: text to synthesize
output_path: output file path
voice_id: voice ID
speed: speaking speed
voice_language: language
Returns:
Path of the written file
"""
import httpx
output_path = Path(output_path)
output_path.parent.mkdir(parents=True, exist_ok=True)
# 获取音频 URL
audio_url = await self.synthesize_sync(
text=text,
voice_id=voice_id,
speed=speed,
voice_language=voice_language,
)
# 下载音频并保存
async with httpx.AsyncClient(timeout=60.0) as client:
response = await client.get(audio_url)
response.raise_for_status()
audio_bytes = response.content
output_path.write_bytes(audio_bytes)
logger.info(f"[TTS] 语音合成完成: {output_path}")
return output_path
async def batch_synthesize(
self,
segments: list[dict],
output_dir: str | Path,
voice_id: str | None = None,
speed: float = 1.0,
) -> list[dict]:
"""
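Synthesize multiple segments one after another.
Args:
segments: list of segments; each item contains text, index (optional), filename (optional)
output_dir: output directory
voice_id: voice ID
speed: speaking speed
Returns:
List of results; each item contains index, text, output_path (None on failure), success, and error (None on success)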
"""
output_dir = Path(output_dir)
output_dir.mkdir(parents=True, exist_ok=True)
results = []
for seg in segments:
text = seg.get("text", "")
index = seg.get("index", len(results))
filename = seg.get("filename", f"audio_{index:04d}.mp3")
try:
output_path = await self.synthesize_to_file(
text=text,
output_path=output_dir / filename,
voice_id=voice_id,
speed=speed,
)
results.append({
"index": index,
"text": text,
"output_path": str(output_path),
"success": True,
"error": None,
})
except Exception as e:
logger.error(f"[TTS] 分段 {index} 合成失败: {e}")
results.append({
"index": index,
"text": text,
"output_path": None,
"success": False,
"error": str(e),
})
return results
@staticmethod
def get_preset_voices() -> list[dict]:
"""获取预设音色列表"""
return TTSService.PRESET_VOICES
@staticmethod
def get_voice_by_id(voice_id: str) -> dict | None:
"""根据 ID 获取音色信息"""
for voice in TTSService.PRESET_VOICES:
if voice["voice_id"] == voice_id:
return voice
return None
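The polling loop in `_wait_for_task` counts elapsed time by adding up sleep intervals, so time spent inside the status request itself is not counted. A minimal sketch of a deadline-based variant follows, assuming a hypothetical `poll_until_done` helper and a fake status source; neither is part of this commit, and the status names are the ones the service code above checks for.

```python
import asyncio
import time
from collections.abc import Awaitable, Callable


async def poll_until_done(
    fetch_status: Callable[[], Awaitable[dict]],
    timeout: float = 120.0,
    interval: float = 2.0,
) -> dict:
    """Poll fetch_status until a terminal status is seen or the deadline passes.

    Hypothetical helper: fetch_status stands in for a call such as
    provider.get_tts_task(task_id).
    """
    deadline = time.monotonic() + timeout
    while True:
        result = await fetch_status()
        status = result.get("status") or result.get("task_status", "")
        if status == "succeed":
            return result
        if status in ("failed", "error"):
            raise ValueError(f"task failed: {result.get('message', 'unknown error')}")
        # Stop if the next sleep would carry us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"task did not finish within {timeout} seconds")
        await asyncio.sleep(interval)


async def _demo() -> dict:
    # Fake status source: pending, processing, then succeed.
    responses = iter([
        {"status": "pending"},
        {"task_status": "processing"},
        {"status": "succeed", "task_result": {"audio_url": "https://example.com/a.mp3"}},
    ])

    async def fake_fetch() -> dict:
        return next(responses)

    return await poll_until_done(fake_fetch, timeout=1.0, interval=0.01)


demo_result = asyncio.run(_demo())
```

Unlike the loop in the diff, this sketch polls once before the first sleep. Whether that is desirable depends on how soon the provider can report a result, which the diff does not say.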
@@ -0,0 +1,258 @@
"""
Voice cloning service layer
===========================
Wraps the Kling AI voice cloning API to provide custom voice cloning.
API docs: https://klingai.com/document-api
"""
import asyncio
import logging
from enum import Enum
from app.ai.providers.klingai_provider import KlingAIProvider
from app.config import get_settings
logger = logging.getLogger(__name__)
# 克隆任务配置
CLONE_TASK_TIMEOUT = 600 # 克隆任务最大等待时间(秒)
CLONE_POLL_INTERVAL = 5.0 # 轮询间隔(秒)
def _get_kling_provider() -> KlingAIProvider:
"""获取 KlingAI Provider 实例"""
settings = get_settings()
config = {
"access_key": settings.KLINGAI_ACCESS_KEY or "",
"secret_key": settings.KLINGAI_SECRET_KEY or "",
}
return KlingAIProvider(config)
class CloneTaskStatus(Enum):
"""Clone task status (an Enum whose values are strings)"""
PENDING = "pending" # 任务已提交,等待处理
PROCESSING = "processing" # 正在处理
SUCCEEDED = "succeeded" # 成功
FAILED = "failed" # 失败
TIMEOUT = "timeout" # 超时
class VoiceCloneService:
"""Kling AI 声音克隆服务客户端"""
def __init__(self) -> None:
self.provider = _get_kling_provider()
self.timeout = CLONE_TASK_TIMEOUT
async def submit_clone_task(
self,
source_audio_url: str | None = None,
source_video_url: str | None = None,
video_id: str | None = None,
voice_name: str | None = None,
callback_url: str | None = None,
external_task_id: str | None = None,
) -> str:
"""
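Submit a voice cloning task.
Args:
source_audio_url: source audio URL (5-30 seconds, mp3/wav, must be publicly accessible)
source_video_url: source video URL (optional)
video_id: ID of an existing work (optional; clones the voice from that work)
voice_name: custom voice name (at most 20 characters)
callback_url: callback URL
external_task_id: caller-defined task ID
Returns:
Clone task ID
Raises:
ValueError: invalid parameters, or the API returned no task_id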
"""
if not source_audio_url and not source_video_url and not video_id:
raise ValueError("必须提供 source_audio_url、source_video_url 或 video_id 之一")
if source_audio_url and not source_audio_url.startswith(("http://", "https://")):
raise ValueError("source_audio_url 必须是有效的 URL")
if source_video_url and not source_video_url.startswith(("http://", "https://")):
raise ValueError("source_video_url 必须是有效的 URL")
if voice_name and len(voice_name) > 20:
raise ValueError("voice_name 不能超过 20 字符")
# 提交克隆任务
result = await self.provider.create_custom_voice(
voice_name=voice_name or "自定义音色",
audio_url=source_audio_url,
video_url=source_video_url,
video_id=video_id,
callback_url=callback_url,
external_task_id=external_task_id,
)
# Kling API 返回 task_id
task_id = result.get("task_id")
if not task_id:
raise ValueError("提交克隆任务失败: 未返回 task_id")
logger.info(f"[VoiceClone] 提交任务成功: task_id={task_id}")
return task_id
async def query_clone_task(self, task_id: str, blocking: bool = False) -> dict:
"""
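Query the status of a voice cloning task.
Args:
task_id: task ID
blocking: whether to block until the task finishes (False returns the current status immediately)
Returns:
Task status information with the fields:
- task_id: task ID
- status: task status (pending/processing/succeeded/failed/timeout)
- voice_id: ID of the cloned voice (once finished)
- trial_url: preview URL (once finished)
- error_message: error message (on failure)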
"""
# Kling 使用不同的查询接口
result = await self.provider.get_custom_voice_task(task_id)
status = result.get("task_status", "pending")
# 映射状态
status_map = {
"pending": CloneTaskStatus.PENDING.value,
"processing": CloneTaskStatus.PROCESSING.value,
"succeed": CloneTaskStatus.SUCCEEDED.value,
"failed": CloneTaskStatus.FAILED.value,
}
mapped_status = status_map.get(status, status)
ret = {
"task_id": task_id,
"status": mapped_status,
"voice_id": None,
"trial_url": None,
"error_message": None,
}
# 提取音色信息
if mapped_status == CloneTaskStatus.SUCCEEDED.value:
task_result = result.get("task_result", {})
if isinstance(task_result, dict):
voices = task_result.get("voices", [])
if voices:
ret["voice_id"] = voices[0].get("voice_id")
ret["trial_url"] = voices[0].get("trial_url")
if mapped_status == CloneTaskStatus.FAILED.value:
ret["error_message"] = result.get("message", "任务失败")
if blocking and mapped_status in (CloneTaskStatus.PENDING.value, CloneTaskStatus.PROCESSING.value):
ret = await self._wait_for_completion(task_id)
return ret
async def _wait_for_completion(self, task_id: str, poll_interval: float = CLONE_POLL_INTERVAL) -> dict:
"""
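Block until the clone task finishes.
Args:
task_id: task ID
poll_interval: polling interval in seconds
Returns:
Final task status; status is "timeout" if the task does not finish within self.timeout seconds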
"""
elapsed = 0.0
while elapsed < self.timeout:
await asyncio.sleep(poll_interval)
elapsed += poll_interval
result = await self.query_clone_task(task_id, blocking=False)
status = result.get("status", "pending")
logger.debug(f"[VoiceClone] task_id={task_id}, status={status}, elapsed={elapsed}s")
if status in (CloneTaskStatus.SUCCEEDED.value, CloneTaskStatus.FAILED.value):
return result
# 超时
logger.warning(f"[VoiceClone] task_id={task_id} 等待超时")
return {
"task_id": task_id,
"status": CloneTaskStatus.TIMEOUT.value,
"voice_id": None,
"trial_url": None,
"error_message": f"等待超时({self.timeout}秒)",
}
async def wait_for_clone(
self,
source_audio_url: str | None = None,
source_video_url: str | None = None,
video_id: str | None = None,
voice_name: str | None = None,
poll_interval: float = CLONE_POLL_INTERVAL,
) -> dict:
"""
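One-stop helper: submit a clone task and wait for it to finish.
Args:
source_audio_url: source audio URL
source_video_url: source video URL
video_id: ID of an existing work
voice_name: custom voice name
poll_interval: polling interval in seconds
Returns:
Final task status; on timeout no exception is raised and status is "timeout"
Raises:
ValueError: submission failed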
"""
task_id = await self.submit_clone_task(
source_audio_url=source_audio_url,
source_video_url=source_video_url,
video_id=video_id,
voice_name=voice_name,
)
result = await self.query_clone_task(task_id, blocking=False)
status = result.get("status", "pending")
if status == CloneTaskStatus.SUCCEEDED.value:
logger.info(f"[VoiceClone] 克隆成功: task_id={task_id}")
return result
# 阻塞等待
result = await self._wait_for_completion(task_id, poll_interval=poll_interval)
return result
async def list_custom_voices(self) -> list[dict]:
"""
查询自定义音色列表。
Returns:
自定义音色列表
"""
return await self.provider.list_custom_voices()
async def delete_custom_voice(self, voice_id: str) -> bool:
"""
删除自定义音色。
Args:
voice_id: 音色 ID
Returns:
是否删除成功
"""
result = await self.provider.delete_custom_voice(voice_id)
return result.get("code") == 0
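The status mapping and voice extraction in `query_clone_task` can be isolated as a pure function, which makes it testable without a provider. The sketch below assumes a hypothetical `normalize_clone_result` helper. The response shape (`task_status`, `task_result.voices`) is copied from the service code in this diff, not from Kling's API reference.

```python
# Provider status names on the left, service status names on the right,
# as they appear in the diff above.
STATUS_MAP = {
    "pending": "pending",
    "processing": "processing",
    "succeed": "succeeded",
    "failed": "failed",
}


def normalize_clone_result(task_id: str, raw: dict) -> dict:
    """Turn a raw provider response into the flat dict the service returns."""
    raw_status = raw.get("task_status", "pending")
    status = STATUS_MAP.get(raw_status, raw_status)  # unknown statuses pass through
    ret = {
        "task_id": task_id,
        "status": status,
        "voice_id": None,
        "trial_url": None,
        "error_message": None,
    }
    if status == "succeeded":
        task_result = raw.get("task_result")
        voices = task_result.get("voices") if isinstance(task_result, dict) else None
        if voices:
            ret["voice_id"] = voices[0].get("voice_id")
            ret["trial_url"] = voices[0].get("trial_url")
    elif status == "failed":
        ret["error_message"] = raw.get("message", "task failed")
    return ret


ok = normalize_clone_result(
    "t1",
    {
        "task_status": "succeed",
        "task_result": {"voices": [{"voice_id": "v1", "trial_url": "https://example.com/t.mp3"}]},
    },
)
bad = normalize_clone_result("t2", {"task_status": "failed", "message": "unsupported audio format"})
```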
+10 -62
@@ -1,98 +1,46 @@
services:
# PostgreSQL 数据库
db:
image: postgres:15-alpine
container_name: meijiaka-db
environment:
POSTGRES_USER: postgres
POSTGRES_PASSWORD: postgres
POSTGRES_DB: meijiaka
volumes:
- postgres_data:/var/lib/postgresql/data
ports:
- "5432:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 5s
timeout: 5s
retries: 5
networks:
- meijiaka-network
# Redis 缓存
redis:
image: redis:7-alpine
container_name: meijiaka-redis
volumes:
- redis_data:/data
ports:
- "6379:6379"
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 5s
retries: 5
networks:
- meijiaka-network
# FastAPI 应用(开发模式)
api:
build:
context: .
dockerfile: Dockerfile
container_name: meijiaka-api
container_name: meijiaka-zj-api
environment:
- ENV=development
- DEBUG=true
- DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/meijiaka
- DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/meijiaka_zj
- REDIS_HOST=redis
- REDIS_PORT=6379
- REDIS_DB=0
- REDIS_DB=1
- SECRET_KEY=dev-secret-key-change-in-production
volumes:
- .:/app
- ~/Documents/Meijiaka:/root/Documents/Meijiaka
- ~/Documents/Meijiaka-zj:/root/Documents/Meijiaka-zj
ports:
- "8080:8000"
depends_on:
db:
condition: service_healthy
redis:
condition: service_healthy
- "8081:8000"
command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
networks:
- meijiaka-network
# Async Engine Scheduler: 统一调度所有第三方异步任务
scheduler:
build:
context: .
dockerfile: Dockerfile
container_name: meijiaka-scheduler
container_name: meijiaka-zj-scheduler
environment:
- ENV=development
- DEBUG=true
- DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/meijiaka
- DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/meijiaka_zj
- REDIS_HOST=redis
- REDIS_PORT=6379
- REDIS_DB=0
- REDIS_DB=1
- SECRET_KEY=dev-secret-key-change-in-production
volumes:
- .:/app
- ~/Documents/Meijiaka:/root/Documents/Meijiaka
depends_on:
db:
condition: service_healthy
redis:
condition: service_healthy
- ~/Documents/Meijiaka-zj:/root/Documents/Meijiaka-zj
command: python -m app.scheduler.main
networks:
- meijiaka-network
volumes:
postgres_data:
redis_data:
networks:
meijiaka-network:
driver: bridge
external: true
+168
@@ -0,0 +1,168 @@
# This file was autogenerated by uv via the following command:
# uv pip compile pyproject.toml -o requirements.lock
aiohappyeyeballs==2.6.1
# via aiohttp
aiohttp==3.13.5
# via meijiaka-ai-api (pyproject.toml)
aiosignal==1.4.0
# via aiohttp
annotated-doc==0.0.4
# via fastapi
annotated-types==0.7.0
# via pydantic
anyio==4.13.0
# via
# httpx
# openai
# starlette
# volcengine-python-sdk
# watchfiles
asyncpg==0.30.0
# via meijiaka-ai-api (pyproject.toml)
attrs==26.1.0
# via aiohttp
bcrypt==4.2.1
# via
# meijiaka-ai-api (pyproject.toml)
# passlib
certifi==2026.2.25
# via
# httpcore
# httpx
# requests
# volcengine-python-sdk
cffi==2.0.0
# via cryptography
charset-normalizer==3.4.7
# via requests
click==8.3.2
# via uvicorn
cryptography==46.0.6
# via
# python-jose
# volcengine-python-sdk
distro==1.9.0
# via openai
ecdsa==0.19.2
# via python-jose
fastapi==0.135.3
# via meijiaka-ai-api (pyproject.toml)
frozenlist==1.8.0
# via
# aiohttp
# aiosignal
greenlet==3.3.2
# via sqlalchemy
h11==0.16.0
# via
# httpcore
# uvicorn
httpcore==1.0.9
# via httpx
httptools==0.7.1
# via uvicorn
httpx==0.28.1
# via
# meijiaka-ai-api (pyproject.toml)
# openai
# volcengine-python-sdk
idna==3.11
# via
# anyio
# httpx
# requests
# yarl
jiter==0.13.0
# via openai
multidict==6.7.1
# via
# aiohttp
# yarl
openai==1.58.1
# via meijiaka-ai-api (pyproject.toml)
orjson==3.11.8
# via meijiaka-ai-api (pyproject.toml)
passlib==1.7.4
# via meijiaka-ai-api (pyproject.toml)
propcache==0.4.1
# via
# aiohttp
# yarl
pyasn1==0.4.8
# via
# python-jose
# rsa
pycparser==3.0
# via cffi
pydantic==2.9.2
# via
# meijiaka-ai-api (pyproject.toml)
# fastapi
# openai
# pydantic-settings
# volcengine-python-sdk
pydantic-core==2.23.4
# via pydantic
pydantic-settings==2.6.1
# via meijiaka-ai-api (pyproject.toml)
python-dateutil==2.9.0.post0
# via volcengine-python-sdk
python-dotenv==1.2.2
# via
# pydantic-settings
# uvicorn
python-jose==3.4.0
# via meijiaka-ai-api (pyproject.toml)
python-multipart==0.0.24
# via meijiaka-ai-api (pyproject.toml)
pyyaml==6.0.3
# via
# meijiaka-ai-api (pyproject.toml)
# uvicorn
qiniu==7.13.2
# via meijiaka-ai-api (pyproject.toml)
redis==5.2.1
# via meijiaka-ai-api (pyproject.toml)
requests==2.33.1
# via qiniu
rsa==4.9.1
# via python-jose
six==1.17.0
# via
# ecdsa
# python-dateutil
# volcengine-python-sdk
sniffio==1.3.1
# via openai
sqlalchemy==2.0.49
# via meijiaka-ai-api (pyproject.toml)
starlette==1.0.0
# via fastapi
tqdm==4.67.3
# via openai
typing-extensions==4.15.0
# via
# fastapi
# openai
# pydantic
# pydantic-core
# sqlalchemy
# typing-inspection
typing-inspection==0.4.2
# via fastapi
urllib3==2.6.3
# via
# requests
# volcengine-python-sdk
uvicorn==0.32.1
# via meijiaka-ai-api (pyproject.toml)
uvloop==0.22.1
# via uvicorn
volcengine-python-sdk==5.0.22
# via meijiaka-ai-api (pyproject.toml)
watchfiles==1.1.1
# via uvicorn
websockets==16.0
# via uvicorn
yarl==1.23.0
# via aiohttp
+8
@@ -0,0 +1,8 @@
"""
pytest 配置文件
"""
import sys
from pathlib import Path
# Add the project root (the directory that contains the app package) to the Python path
sys.path.insert(0, str(Path(__file__).parent.parent))
+299
@@ -0,0 +1,299 @@
"""
Kling TTS 服务单元测试
"""
import pytest
from unittest.mock import AsyncMock, patch, MagicMock
import httpx
class TestTTSService:
"""TTS 服务测试"""
@pytest.fixture
def mock_provider(self):
"""创建模拟的 KlingAIProvider"""
mock = MagicMock()
mock.generate_tts = AsyncMock(return_value={"task_id": "test-task-123"})
mock.get_tts_task = AsyncMock(return_value={
"status": "succeed",
"task_result": {"audio_url": "https://example.com/audio.mp3"}
})
return mock
@pytest.fixture
def tts_service(self, mock_provider):
"""Create a TTSService instance (with a mock provider)"""
from app.services.tts_service import TTSService
service = TTSService.__new__(TTSService) # 不调用 __init__
service.provider = mock_provider
service.default_voice_id = "829826751244537879"
return service
def test_tts_service_init(self, tts_service):
"""测试 TTSService 初始化"""
assert tts_service is not None
assert hasattr(tts_service, "provider")
assert hasattr(tts_service, "default_voice_id")
def test_preset_voices_loaded(self, tts_service):
"""测试预设音色已加载"""
voices = tts_service.get_preset_voices()
assert len(voices) > 0
assert any(v["voice_id"] == "829826751244537879" for v in voices)
def test_get_voice_by_id_found(self, tts_service):
"""测试根据 ID 获取音色 - 找到"""
voice = tts_service.get_voice_by_id("829826751244537879")
assert voice is not None
assert voice["name"] == "温柔女声"
def test_get_voice_by_id_not_found(self, tts_service):
"""测试根据 ID 获取音色 - 未找到"""
voice = tts_service.get_voice_by_id("non-existent-id")
assert voice is None
@pytest.mark.asyncio
async def test_synthesize_sync_success(self, tts_service):
"""测试同步合成成功"""
with patch.object(tts_service.provider, "generate_tts", new_callable=AsyncMock) as mock_generate:
mock_generate.return_value = {"task_id": "test-task-123"}
with patch.object(tts_service.provider, "get_tts_task", new_callable=AsyncMock) as mock_get:
# First call returns pending, second call returns succeed
mock_get.side_effect = [
{"status": "pending"},
{
"status": "succeed",
"task_result": {"audio_url": "https://kling.example.com/audio.mp3"},
},
]
result = await tts_service.synthesize_sync(
text="你好世界",
voice_id="829826751244537879",
speed=1.0,
voice_language="zh"
)
assert result == "https://kling.example.com/audio.mp3"
@pytest.mark.asyncio
async def test_synthesize_sync_empty_text(self, tts_service):
"""测试空文本"""
with pytest.raises(ValueError, match="text 不能为空"):
await tts_service.synthesize_sync(text="", voice_id="test")
@pytest.mark.asyncio
async def test_synthesize_sync_text_too_long(self, tts_service):
"""测试文本超长(验证在 API 调用之前)"""
# Each Chinese character counts as one character, so "测" * 1201 is 1201 characters
long_text = "测" * 1201  # over the 1000-character limit
with pytest.raises(ValueError, match="text 不能超过"):
await tts_service.synthesize_sync(text=long_text, voice_id="test")
@pytest.mark.asyncio
async def test_synthesize_sync_no_task_id(self, tts_service):
"""测试提交任务未返回 task_id"""
with patch.object(tts_service.provider, "generate_tts", new_callable=AsyncMock) as mock_generate:
mock_generate.return_value = {} # 没有 task_id
with pytest.raises(ValueError, match="任务提交失败"):
await tts_service.synthesize_sync(text="你好", voice_id="test")
@pytest.mark.asyncio
async def test_synthesize_sync_task_failed(self, tts_service):
"""测试任务失败"""
with patch.object(tts_service.provider, "generate_tts", new_callable=AsyncMock) as mock_generate:
mock_generate.return_value = {"task_id": "test-task-123"}
with patch.object(tts_service.provider, "get_tts_task", new_callable=AsyncMock) as mock_get:
mock_get.return_value = {"status": "failed", "message": "服务器错误"}
with pytest.raises(ValueError, match="任务失败"):
await tts_service.synthesize_sync(text="你好", voice_id="test")
def test_get_preset_voices_static(self):
"""测试静态方法获取预设音色"""
from app.services.tts_service import TTSService
voices = TTSService.get_preset_voices()
assert len(voices) == 5
# 验证默认音色存在
assert any(v["voice_id"] == "829826751244537879" for v in voices)
class TestVoiceCloneService:
"""声音克隆服务测试"""
@pytest.fixture
def clone_service(self):
"""创建 VoiceCloneService 实例"""
from app.services.voice_clone_service import VoiceCloneService
return VoiceCloneService()
def test_voice_clone_service_init(self, clone_service):
"""测试 VoiceCloneService 初始化"""
assert clone_service is not None
assert hasattr(clone_service, "provider")
assert hasattr(clone_service, "timeout")
@pytest.mark.asyncio
async def test_submit_clone_with_audio_url(self, clone_service):
"""测试使用音频 URL 提交克隆任务"""
with patch.object(clone_service.provider, "create_custom_voice", new_callable=AsyncMock) as mock_create:
mock_create.return_value = {"task_id": "clone-task-123"}
task_id = await clone_service.submit_clone_task(
source_audio_url="https://example.com/source.mp3",
voice_name="我的克隆音色"
)
assert task_id == "clone-task-123"
mock_create.assert_called_once()
call_kwargs = mock_create.call_args.kwargs
assert call_kwargs["audio_url"] == "https://example.com/source.mp3"
assert call_kwargs["voice_name"] == "我的克隆音色"
@pytest.mark.asyncio
async def test_submit_clone_with_video_url(self, clone_service):
"""测试使用视频 URL 提交克隆任务"""
with patch.object(clone_service.provider, "create_custom_voice", new_callable=AsyncMock) as mock_create:
mock_create.return_value = {"task_id": "clone-task-456"}
task_id = await clone_service.submit_clone_task(
source_video_url="https://example.com/source.mp4",
voice_name="视频音色"
)
assert task_id == "clone-task-456"
mock_create.assert_called_once()
call_kwargs = mock_create.call_args.kwargs
assert call_kwargs["video_url"] == "https://example.com/source.mp4"
@pytest.mark.asyncio
async def test_submit_clone_no_source(self, clone_service):
"""测试没有提供任何来源时抛出异常"""
with pytest.raises(ValueError, match="必须提供"):
await clone_service.submit_clone_task()
@pytest.mark.asyncio
async def test_submit_clone_invalid_audio_url(self, clone_service):
"""测试无效的音频 URL"""
with pytest.raises(ValueError, match="有效的 URL"):
await clone_service.submit_clone_task(source_audio_url="not-a-url")
@pytest.mark.asyncio
async def test_submit_clone_voice_name_too_long(self, clone_service):
"""测试音色名称过长"""
with pytest.raises(ValueError, match="不能超过"):
await clone_service.submit_clone_task(
source_audio_url="https://example.com/audio.mp3",
voice_name="a" * 25 # 超过 20 字符
)
@pytest.mark.asyncio
async def test_submit_clone_no_task_id(self, clone_service):
"""测试提交任务未返回 task_id"""
with patch.object(clone_service.provider, "create_custom_voice", new_callable=AsyncMock) as mock_create:
mock_create.return_value = {} # 没有 task_id
with pytest.raises(ValueError, match="未返回 task_id"):
await clone_service.submit_clone_task(
source_audio_url="https://example.com/audio.mp3"
)
@pytest.mark.asyncio
async def test_query_clone_task_success(self, clone_service):
"""测试查询克隆任务状态 - 成功"""
with patch.object(clone_service.provider, "get_custom_voice_task", new_callable=AsyncMock) as mock_get:
mock_get.return_value = {
"task_status": "succeed",
"task_result": {
"voices": [
{"voice_id": "custom-voice-001", "trial_url": "https://example.com/trial.mp3"}
]
}
}
result = await clone_service.query_clone_task("clone-task-123")
assert result["task_id"] == "clone-task-123"
assert result["status"] == "succeeded"
assert result["voice_id"] == "custom-voice-001"
assert result["trial_url"] == "https://example.com/trial.mp3"
@pytest.mark.asyncio
async def test_query_clone_task_pending(self, clone_service):
"""测试查询克隆任务状态 - 待处理"""
with patch.object(clone_service.provider, "get_custom_voice_task", new_callable=AsyncMock) as mock_get:
mock_get.return_value = {"task_status": "pending"}
result = await clone_service.query_clone_task("clone-task-123")
assert result["task_id"] == "clone-task-123"
assert result["status"] == "pending"
assert result["voice_id"] is None
@pytest.mark.asyncio
async def test_query_clone_task_failed(self, clone_service):
"""测试查询克隆任务状态 - 失败"""
with patch.object(clone_service.provider, "get_custom_voice_task", new_callable=AsyncMock) as mock_get:
mock_get.return_value = {"task_status": "failed", "message": "音频格式不支持"}
result = await clone_service.query_clone_task("clone-task-123")
assert result["status"] == "failed"
assert "音频格式不支持" in result["error_message"]
@pytest.mark.asyncio
async def test_delete_custom_voice_success(self, clone_service):
"""测试删除自定义音色 - 成功"""
with patch.object(clone_service.provider, "delete_custom_voice", new_callable=AsyncMock) as mock_delete:
mock_delete.return_value = {"code": 0, "message": "success"}
result = await clone_service.delete_custom_voice("custom-voice-001")
assert result is True
mock_delete.assert_called_once_with("custom-voice-001")
@pytest.mark.asyncio
async def test_delete_custom_voice_failure(self, clone_service):
"""测试删除自定义音色 - 失败"""
with patch.object(clone_service.provider, "delete_custom_voice", new_callable=AsyncMock) as mock_delete:
mock_delete.return_value = {"code": 400, "message": "Invalid voice id"}
result = await clone_service.delete_custom_voice("non-existent")
assert result is False
@pytest.mark.asyncio
async def test_wait_for_clone_immediate_success(self, clone_service):
"""测试立即克隆成功"""
with patch.object(clone_service.provider, "create_custom_voice", new_callable=AsyncMock) as mock_create:
mock_create.return_value = {"task_id": "clone-task-123"}
with patch.object(clone_service.provider, "get_custom_voice_task", new_callable=AsyncMock) as mock_get:
# 第一次查询就成功
mock_get.return_value = {
"task_status": "succeed",
"task_result": {"voices": [{"voice_id": "v1"}]}
}
result = await clone_service.wait_for_clone(
source_audio_url="https://example.com/audio.mp3"
)
assert result["status"] == "succeeded"
assert result["voice_id"] == "v1"
class TestCloneTaskStatus:
"""克隆任务状态枚举测试"""
def test_clone_task_status_values(self):
"""测试状态枚举值"""
from app.services.voice_clone_service import CloneTaskStatus
assert CloneTaskStatus.PENDING.value == "pending"
assert CloneTaskStatus.PROCESSING.value == "processing"
assert CloneTaskStatus.SUCCEEDED.value == "succeeded"
assert CloneTaskStatus.FAILED.value == "failed"
assert CloneTaskStatus.TIMEOUT.value == "timeout"
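When a test needs a polled call to return a different value on each successive call, `AsyncMock` accepts a list as `side_effect` and hands out one item per await. Assigning `return_value` twice only keeps the last assignment. A small self-contained sketch, with `get_task` as a hypothetical stand-in for `provider.get_tts_task`:

```python
import asyncio
from unittest.mock import AsyncMock

# Each await consumes the next item of the side_effect list.
get_task = AsyncMock(
    side_effect=[
        {"status": "pending"},
        {"status": "succeed", "task_result": {"audio_url": "https://example.com/audio.mp3"}},
    ]
)


async def _poll_twice() -> tuple[dict, dict]:
    first = await get_task("task-1")
    second = await get_task("task-1")
    return first, second


first, second = asyncio.run(_poll_twice())
```

A third await on this mock raises, because the list is exhausted. That is useful for catching a loop that polls more often than the test expects.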
+2096
File diff suppressed because it is too large Load Diff
+1 -1
@@ -1,3 +1,3 @@
# Vite 开发环境变量
# 前端默认连接的 Python API 地址
VITE_API_BASE_URL=http://127.0.0.1:8080/api/v1
VITE_API_BASE_URL=http://127.0.0.1:8081/api/v1
-19
@@ -1,19 +0,0 @@
[Script Info]
; Script generated by Meijiaka AI Video
; Style: 抖音美好体 - 抖音风格字幕
ScriptType: v4.00+
PlayResX: 1920
PlayResY: 1080
ScaledBorderAndShadow: yes
Video Aspect Ratio: 0
Video Rate: 25
Audio Rate: 48000
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Douyin-Diamond,抖音美好体,72,&H00FFFFFF,&H00000000,&H00141414,&H00000000,1,0,0,0,100,100,0,0,1,3,0,2,20,20,80,1
Style: Douyin-Bold,抖音美好体,64,&H00FFFFFF,&H00000000,&H00141414,&H00000000,1,0,0,0,100,100,0,0,1,2,0,2,20,20,60,1
Style: Douyin-Small,抖音美好体,54,&H00EEEEEE,&H00000000,&H00141414,&H00000000,1,0,0,0,100,100,0,0,1,2,0,2,20,20,40,1
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
+1 -1
@@ -5,7 +5,7 @@
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/vite.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>美家卡 智影</title>
<title>美家卡 智剪</title>
<style>
@font-face {
font-family: 'DouyinSans';
+6096
File diff suppressed because it is too large Load Diff
@@ -9,7 +9,7 @@
{
"identifier": "opener:allow-open-path",
"allow": [
{ "path": "$DOCUMENT/Meijiaka/**" },
{ "path": "$DOCUMENT/Meijiaka-zj/**" },
{ "path": "$DOCUMENT/**" },
{ "path": "$APPLOCALDATA/**" },
{ "path": "$APPDATA/**" },
+1
@@ -9,3 +9,4 @@ pub mod auth_state;
pub mod avatar;
pub mod product;
pub mod project;
pub mod voice;
+111
@@ -0,0 +1,111 @@
//! Voice 音频管理 IPC 命令
use crate::ApiResponse;
use crate::storage::voice as voice_storage;
#[derive(serde::Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct SaveAudioArgs {
pub project_id: String,
pub audio_id: String,
pub audio_data: String,
pub name: String,
pub voice_id: String,
pub duration: f64,
}
/// 保存音频文件(前端传入 base64 编码)
#[tauri::command]
pub async fn save_audio(
args: SaveAudioArgs,
) -> ApiResponse<voice_storage::AudioMeta> {
let audio_bytes = match base64::Engine::decode(
&base64::engine::general_purpose::STANDARD,
&args.audio_data,
) {
Ok(data) => data,
Err(e) => return ApiResponse {
code: 400,
message: format!("Invalid base64 data: {}", e),
data: None,
},
};
match voice_storage::save_audio_file(
&args.project_id,
&args.audio_id,
&audio_bytes,
&args.name,
&args.voice_id,
args.duration,
) {
Ok(meta) => ApiResponse {
code: 200,
message: "Audio saved successfully".to_string(),
data: Some(meta),
},
Err(e) => ApiResponse {
code: 500,
message: format!("Failed to save audio: {}", e),
data: None,
},
}
}
/// 列出项目所有音频文件
#[tauri::command]
pub async fn list_project_audios(
project_id: String,
) -> ApiResponse<Vec<voice_storage::AudioMeta>> {
match voice_storage::list_project_audios(&project_id) {
Ok(audios) => ApiResponse {
code: 200,
message: "Audio list retrieved".to_string(),
data: Some(audios),
},
Err(e) => ApiResponse {
code: 500,
message: format!("Failed to list audios: {}", e),
data: Some(vec![]),
},
}
}
/// 删除音频文件
#[tauri::command]
pub async fn delete_audio(
project_id: String,
audio_id: String,
) -> ApiResponse<bool> {
match voice_storage::delete_audio_file(&project_id, &audio_id) {
Ok(_) => ApiResponse {
code: 200,
message: "Audio deleted successfully".to_string(),
data: Some(true),
},
Err(e) => ApiResponse {
code: 500,
message: format!("Failed to delete audio: {}", e),
data: None,
},
}
}
/// 获取项目音频目录(供前端 FFmpeg 调用)
#[tauri::command]
pub async fn get_project_audios_dir(
project_id: String,
) -> ApiResponse<String> {
match voice_storage::get_project_audios_dir(&project_id) {
Ok(path) => ApiResponse {
code: 200,
message: "Audio directory path".to_string(),
data: Some(path.to_string_lossy().to_string()),
},
Err(e) => ApiResponse {
code: 500,
message: format!("Failed to get audio dir: {}", e),
data: None,
},
}
}
+248 -38
@@ -2,6 +2,64 @@ use tauri_plugin_shell::ShellExt;
use tauri_plugin_shell::process::CommandEvent;
use tauri::{AppHandle, Emitter};
/// Escape a path for a single-quoted FFmpeg concat list entry (replaces `'` with `'\''`; `:` is left unchanged)
fn escape_ffmpeg_path(path: &str) -> String {
path.replace("'", "'\\''")
}
/// 验证路径在允许的目录内,防止路径遍历攻击
/// 允许的目录:用户文档目录下的 Meijiaka-zj 文件夹
fn validate_safe_path(path: &str) -> Result<String, String> {
let path = std::path::Path::new(path);
// 获取绝对路径
let abs_path = if path.is_absolute() {
path.to_path_buf()
} else {
std::env::current_dir()
.map_err(|e| format!("无法获取当前目录: {}", e))?
.join(path)
};
// 检查是否在允许的目录内
let docs_dir = dirs::document_dir()
.ok_or("无法获取文档目录")?;
let allowed_dir = docs_dir.join("Meijiaka-zj");
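// Canonicalize the path. If it cannot be canonicalized (for example because it
// does not exist yet), reject any `..` component so that an unresolved path
// cannot pass the prefix check and still escape the allowed directory.
let canonical = match abs_path.canonicalize() {
Ok(p) => p,
Err(_) => {
if abs_path.components().any(|c| matches!(c, std::path::Component::ParentDir)) {
return Err(format!("路径包含上级目录引用: {}", abs_path.display()));
}
abs_path.clone()
}
};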
// 检查是否在允许目录下
if !canonical.starts_with(&allowed_dir) {
return Err(format!("路径不在允许目录内: {}", path.display()));
}
Ok(canonical.to_string_lossy().to_string())
}
/// 清理并验证输出路径
fn sanitize_output_path(path: &str) -> Result<String, String> {
let path = std::path::Path::new(path);
// 获取父目录并验证
if let Some(parent) = path.parent() {
validate_safe_path(&parent.to_string_lossy())?;
}
// 确保是绝对路径
if path.is_absolute() {
Ok(path.to_string_lossy().to_string())
} else {
std::env::current_dir()
.map_err(|e| format!("无法获取当前目录: {}", e))?
.join(path)
.to_str()
.map(|s| s.to_string())
.ok_or_else(|| "无效的路径".to_string())
}
}
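The containment check in `validate_safe_path` only works if the path is normalized before the prefix comparison. Otherwise a path such as `Meijiaka-zj/../elsewhere` still starts with the allowed directory. The idea is sketched below in Python rather than Rust, purely as an illustration. `is_inside` is a hypothetical helper and the temporary directory stands in for the user's documents folder.

```python
import tempfile
from pathlib import Path


def is_inside(candidate: Path, allowed_dir: Path) -> bool:
    """Return True if candidate, once normalized, lies at or under allowed_dir."""
    # resolve() collapses `..` components even when the path does not exist yet.
    resolved = candidate.resolve()
    root = allowed_dir.resolve()
    # Compare whole path components, not string prefixes.
    return resolved == root or root in resolved.parents


with tempfile.TemporaryDirectory() as tmp:
    allowed = Path(tmp) / "Meijiaka-zj"
    allowed.mkdir()
    inside = is_inside(allowed / "project" / "out.mp4", allowed)
    escaped = is_inside(allowed / ".." / "elsewhere" / "out.mp4", allowed)
    lookalike = is_inside(Path(tmp) / "Meijiaka-zj-evil" / "out.mp4", allowed)
```

Comparing whole components also rejects a sibling directory whose name merely begins with the allowed directory's name.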
/**
* FFmpeg Sidecar
*/
@@ -50,8 +108,12 @@ pub async fn run_ffmpeg(app: &AppHandle, args: Vec<String>) -> Result<String, St
* ( 1080:1920, 30fps, libx264, aac 44100Hz stereo)
*/
pub async fn standardize_video(app: &AppHandle, input_path: &str, output_path: &str) -> Result<(), String> {
// 验证路径安全
let safe_input = validate_safe_path(input_path)?;
let safe_output = sanitize_output_path(output_path)?;
let args = vec![
"-i".to_string(), input_path.to_string(),
"-i".to_string(), safe_input,
"-vf".to_string(), "fps=30,scale=1080:1920:force_original_aspect_ratio=decrease,pad=1080:1920:(ow-iw)/2:(oh-ih)/2,format=yuv420p".to_string(),
"-c:v".to_string(), "libx264".to_string(),
"-c:a".to_string(), "aac".to_string(),
@@ -61,7 +123,7 @@ pub async fn standardize_video(app: &AppHandle, input_path: &str, output_path: &
"-crf".to_string(), "23".to_string(),
"-r".to_string(), "30".to_string(),
"-y".to_string(),
output_path.to_string()
safe_output
];
run_ffmpeg(app, args).await.map(|_| ())
}
@@ -70,13 +132,17 @@ pub async fn standardize_video(app: &AppHandle, input_path: &str, output_path: &
* - (/)
*/
pub async fn concat_videos_copy(app: &AppHandle, list_path: &str, output_path: &str) -> Result<(), String> {
// 验证路径安全
let safe_list = validate_safe_path(list_path)?;
let safe_output = sanitize_output_path(output_path)?;
let args = vec![
"-f".to_string(), "concat".to_string(),
"-safe".to_string(), "0".to_string(),
"-i".to_string(), list_path.to_string(),
"-i".to_string(), safe_list,
"-c".to_string(), "copy".to_string(),
"-y".to_string(),
output_path.to_string()
safe_output
];
run_ffmpeg(app, args).await.map(|_| ())
}
@@ -93,7 +159,7 @@ pub async fn concat_videos_robust(app: &AppHandle, video_paths: Vec<String>, out
let mut standardized_paths = Vec::new();
// 1. Standardize each clip
// 1. Standardize each clip (paths are validated internally)
for (i, path) in video_paths.iter().enumerate() {
let std_path = temp_dir.join(format!("std_{}_{}.mp4", timestamp, i));
standardize_video(app, path, std_path.to_str().unwrap()).await?;
@@ -104,11 +170,12 @@ pub async fn concat_videos_robust(app: &AppHandle, video_paths: Vec<String>, out
let list_path = temp_dir.join(format!("concat_list_{}.txt", timestamp));
let mut list_content = String::new();
for path in &standardized_paths {
list_content.push_str(&format!("file '{}'\n", path.to_str().unwrap()));
// Paths in the FFmpeg concat list must be wrapped in single quotes
list_content.push_str(&format!("file '{}'\n", escape_ffmpeg_path(&path.to_string_lossy())));
}
std::fs::write(&list_path, list_content).map_err(|e| e.to_string())?;
// 3. Run the fast concat
// 3. Run the fast concat (the output path is validated)
let concat_res = concat_videos_copy(app, list_path.to_str().unwrap(), output_path).await;
// 4. Clean up temporary files
@@ -124,9 +191,14 @@ pub async fn concat_videos_robust(app: &AppHandle, video_paths: Vec<String>, out
* -
*/
pub async fn add_audio_to_video(app: &AppHandle, video_path: &str, audio_path: &str, output_path: &str) -> Result<(), String> {
// Validate path safety
let safe_video = validate_safe_path(video_path)?;
let safe_audio = validate_safe_path(audio_path)?;
let safe_output = sanitize_output_path(output_path)?;
let args = vec![
"-i".to_string(), video_path.to_string(),
"-i".to_string(), audio_path.to_string(),
"-i".to_string(), safe_video,
"-i".to_string(), safe_audio,
"-c:v".to_string(), "copy".to_string(),
"-c:a".to_string(), "aac".to_string(),
"-b:a".to_string(), "192k".to_string(), // higher bitrate
@@ -135,7 +207,7 @@ pub async fn add_audio_to_video(app: &AppHandle, video_path: &str, audio_path: &
"-map".to_string(), "1:a:0".to_string(),
"-shortest".to_string(),
"-y".to_string(),
output_path.to_string()
safe_output
];
run_ffmpeg(app, args).await.map(|_| ())
}
@@ -145,9 +217,13 @@ pub async fn add_audio_to_video(app: &AppHandle, video_path: &str, audio_path: &
* concat
*/
pub async fn create_cover_video(app: &AppHandle, input_path: &str, output_path: &str, duration: &str) -> Result<(), String> {
// Validate path safety
let safe_input = validate_safe_path(input_path)?;
let safe_output = sanitize_output_path(output_path)?;
let args = vec![
"-loop".to_string(), "1".to_string(),
"-i".to_string(), input_path.to_string(),
"-i".to_string(), safe_input,
"-f".to_string(), "lavfi".to_string(),
"-i".to_string(), "anullsrc=r=44100:cl=stereo".to_string(),
"-c:v".to_string(), "libx264".to_string(),
@@ -158,7 +234,7 @@ pub async fn create_cover_video(app: &AppHandle, input_path: &str, output_path:
"-r".to_string(), "30".to_string(),
"-shortest".to_string(),
"-y".to_string(),
output_path.to_string()
safe_output
];
run_ffmpeg(app, args).await.map(|_| ())
}
@@ -193,7 +269,7 @@ fn get_fonts_dir(app: &AppHandle) -> Result<std::path::PathBuf, String> {
return Ok(dev_fonts_path);
}
// If still not found, try one level up
// If still not found, try one level up (supports the project-root layout)
if let Some(parent) = cwd.parent() {
let dev_fonts_path_alt = parent.join("src-tauri/fonts");
println!("[get_fonts_dir] Checking dev parent path: {:?}", dev_fonts_path_alt);
@@ -201,13 +277,16 @@ fn get_fonts_dir(app: &AppHandle) -> Result<std::path::PathBuf, String> {
println!("[get_fonts_dir] Found fonts at: {:?}", dev_fonts_path_alt);
return Ok(dev_fonts_path_alt);
}
}
// Try the absolute path
let abs_path = std::path::Path::new("/Users/0fun/work/ai-meijiaka/tauri-app/src-tauri/fonts");
if abs_path.exists() {
println!("[get_fonts_dir] Found fonts at absolute path: {:?}", abs_path);
return Ok(abs_path.to_path_buf());
// Go up one more level (if cwd is src-tauri/src)
if let Some(grandparent) = parent.parent() {
let dev_fonts_path_grand = grandparent.join("tauri-app/src-tauri/fonts");
println!("[get_fonts_dir] Checking grandparent path: {:?}", dev_fonts_path_grand);
if dev_fonts_path_grand.exists() {
println!("[get_fonts_dir] Found fonts at: {:?}", dev_fonts_path_grand);
return Ok(dev_fonts_path_grand);
}
}
}
Err("Could not find fonts directory in any location".to_string())
@@ -221,13 +300,17 @@ pub async fn extract_first_frame(
video_path: &str,
output_path: &str,
) -> Result<(), String> {
// Validate path safety
let safe_video = validate_safe_path(video_path)?;
let safe_output = sanitize_output_path(output_path)?;
let args = vec![
"-i".to_string(), video_path.to_string(),
"-i".to_string(), safe_video,
"-ss".to_string(), "00:00:00".to_string(),
"-vframes".to_string(), "1".to_string(),
"-q:v".to_string(), "2".to_string(),
"-y".to_string(),
output_path.to_string(),
safe_output,
];
run_ffmpeg(app, args).await.map(|_| ())
}
@@ -243,14 +326,17 @@ pub async fn burn_ass_subtitle(
ass_path: &str,
output_path: &str,
) -> Result<(), String> {
// Validate path safety
let safe_video = validate_safe_path(video_path)?;
let safe_ass = validate_safe_path(ass_path)?;
let safe_output = sanitize_output_path(output_path)?;
let fonts_dir = get_fonts_dir(app)?;
let fonts_dir_str = fonts_dir.to_str().ok_or("Invalid fonts dir path")?;
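// Escape special characters in the path (required by FFmpeg filter syntax)
// Only the colon : needs escaping, because it is the filter-option separator
// In FFmpeg, a fully quoted path needs no further escaping of inner single quotes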
let ass_path_escaped = ass_path.replace(":", "\\:");
let fonts_dir_escaped = fonts_dir_str.replace(":", "\\:");
// Use the shared path-escaping function for paths inside FFmpeg filters
let ass_path_escaped = escape_ffmpeg_path(&safe_ass);
let fonts_dir_escaped = escape_ffmpeg_path(fonts_dir_str);
// Use the ass filter and specify the fonts directory
// Syntax: ass='path':fontsdir='path'
@@ -259,14 +345,14 @@ pub async fn burn_ass_subtitle(
println!("FFmpeg filter: {}", filter); // debug log
let args = vec![
"-i".to_string(), video_path.to_string(),
"-i".to_string(), safe_video,
"-vf".to_string(), filter,
"-c:v".to_string(), "libx264".to_string(),
"-preset".to_string(), "medium".to_string(),
"-crf".to_string(), "23".to_string(),
"-c:a".to_string(), "copy".to_string(), // copy the audio stream as-is
"-y".to_string(),
output_path.to_string()
safe_output
];
run_ffmpeg(app, args).await.map(|_| ())
@@ -283,22 +369,140 @@ pub async fn burn_ass_subtitle_with_fonts(
fonts_dir: &str,
output_path: &str,
) -> Result<(), String> {
let filter = format!("ass='{}':fontsdir='{}'", ass_path, fonts_dir);
// Validate path safety
let safe_video = validate_safe_path(video_path)?;
let safe_ass = validate_safe_path(ass_path)?;
let safe_output = sanitize_output_path(output_path)?;
let filter = format!(
"ass='{}':fontsdir='{}'",
escape_ffmpeg_path(&safe_ass),
escape_ffmpeg_path(fonts_dir)
);
let args = vec![
"-i".to_string(), video_path.to_string(),
"-i".to_string(), safe_video,
"-vf".to_string(), filter,
"-c:v".to_string(), "libx264".to_string(),
"-preset".to_string(), "medium".to_string(),
"-crf".to_string(), "23".to_string(),
"-c:a".to_string(), "copy".to_string(),
"-y".to_string(),
output_path.to_string()
safe_output
];
run_ffmpeg(app, args).await.map(|_| ())
}
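The `ass='…':fontsdir='…'` filter strings above depend on `escape_ffmpeg_path`, whose body is outside this hunk. As a rough sketch only, the two replacements that the removed inline code in this diff applied (single quote, then colon) could be gathered into one helper like this; `escape_filter_path_sketch` and `ass_filter` are hypothetical names, and the project's real function may escape more characters or do so differently:

```rust
/// Hypothetical stand-in that mirrors the replacements the pre-refactor code inlined.
/// It is not the project's `escape_ffmpeg_path`.
fn escape_filter_path_sketch(path: &str) -> String {
    path.replace('\'', "'\\''") // close the quote, emit an escaped ', reopen the quote
        .replace(':', "\\:") // ':' separates filter options, so it is backslash-escaped
}

/// Build the `ass` filter string in the same shape the diff uses.
fn ass_filter(ass_path: &str, fonts_dir: &str) -> String {
    format!(
        "ass='{}':fontsdir='{}'",
        escape_filter_path_sketch(ass_path),
        escape_filter_path_sketch(fonts_dir)
    )
}
```

Routing every filter path through one function keeps the three subtitle-burning call sites consistent, which appears to be the motivation for the change.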
/**
* Replace the audio track
*
* Replace the video's original audio with the TTS/dubbing audio
*/
pub async fn replace_audio_track(
app: &AppHandle,
video_path: &str,
audio_path: &str,
output_path: &str,
) -> Result<(), String> {
// Validate path safety
let safe_video = validate_safe_path(video_path)?;
let safe_audio = validate_safe_path(audio_path)?;
let safe_output = sanitize_output_path(output_path)?;
let args = vec![
"-i".to_string(), safe_video,
"-i".to_string(), safe_audio,
// Video stream: copy as-is (no re-encoding)
"-c:v".to_string(), "copy".to_string(),
// Audio stream: transcode to AAC
"-c:a".to_string(), "aac".to_string(),
"-b:a".to_string(), "192k".to_string(),
"-ar".to_string(), "44100".to_string(),
"-ac".to_string(), "2".to_string(),
// Keep only the first video stream and the first audio stream
"-map".to_string(), "0:v:0".to_string(),
"-map".to_string(), "1:a:0".to_string(),
"-shortest".to_string(),
"-y".to_string(),
safe_output,
];
run_ffmpeg(app, args).await.map(|_| ())
}
/**
* Mix audio tracks
*
* Voice + background music
* 0dB 6dB
*/
pub async fn mix_audio_tracks(
app: &AppHandle,
voice_path: &str,
bgm_path: &str,
output_path: &str,
voice_volume: f64,
bgm_volume: f64,
) -> Result<(), String> {
// Validate path safety
let safe_voice = validate_safe_path(voice_path)?;
let safe_bgm = validate_safe_path(bgm_path)?;
let safe_output = sanitize_output_path(output_path)?;
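// Clamp the volume range to guard against command injection
// The volume parameter only accepts floats in the range 0.0 - 10.0
let _safe_voice_vol = voice_volume.clamp(0.0, 10.0); // voice_volume is currently unused; kept for future extension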
let safe_bgm_vol = bgm_volume.clamp(0.0, 10.0);
// Format as a safe numeric string (at most 2 decimal places)
let bgm_vol_str = format!("{:.2}", safe_bgm_vol);
// Build a safe filter string
// The FFmpeg volume filter takes a numeric argument; the formatted value is used here
let filter = format!(
"[1:a]volume={}[bgm];[0:a][bgm]amix=inputs=2:duration=first:dropout_transition=2[out]",
bgm_vol_str
);
let args = vec![
"-i".to_string(), safe_voice,
"-i".to_string(), safe_bgm,
"-filter_complex".to_string(), filter,
"-map".to_string(), "0:v:0".to_string(),
"-map".to_string(), "[out]".to_string(),
"-c:v".to_string(), "copy".to_string(),
"-c:a".to_string(), "aac".to_string(),
"-ar".to_string(), "44100".to_string(),
"-y".to_string(),
safe_output,
];
run_ffmpeg(app, args).await.map(|_| ())
}
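In `mix_audio_tracks` above, `voice_volume` is clamped but never reaches the filter, and `f64::clamp` passes `NaN` through unchanged. A minimal sketch of a filter builder that applies both gains and falls back to defaults for non-finite input; the helper names are hypothetical, and the defaults 1.0 and 0.3 are taken from the `ScriptShot` field comments in this commit:

```rust
/// Hypothetical helper: clamp a gain to 0.0..=10.0, replacing NaN/infinity with `default`.
fn sanitize_volume(v: f64, default: f64) -> f64 {
    if v.is_finite() {
        v.clamp(0.0, 10.0)
    } else {
        default
    }
}

/// Hypothetical builder: apply a `volume` filter to each input, then `amix` them.
fn build_mix_filter(voice_volume: f64, bgm_volume: f64) -> String {
    let voice = sanitize_volume(voice_volume, 1.0);
    let bgm = sanitize_volume(bgm_volume, 0.3);
    format!(
        "[0:a]volume={:.2}[voice];[1:a]volume={:.2}[bgm];\
         [voice][bgm]amix=inputs=2:duration=first:dropout_transition=2[out]",
        voice, bgm
    )
}
```

Because the values are formatted from `f64` with a fixed precision, the filter string can only ever contain digits and a decimal point in the gain positions.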
/**
* Standardize audio (MP3 44.1kHz stereo 192kbps)
*/
pub async fn standardize_audio(
app: &AppHandle,
input_path: &str,
output_path: &str,
) -> Result<(), String> {
// Validate path safety
let safe_input = validate_safe_path(input_path)?;
let safe_output = sanitize_output_path(output_path)?;
let args = vec![
"-i".to_string(), safe_input,
"-c:a".to_string(), "libmp3lame".to_string(),
"-b:a".to_string(), "192k".to_string(),
"-ar".to_string(), "44100".to_string(),
"-ac".to_string(), "2".to_string(),
"-y".to_string(),
safe_output,
];
run_ffmpeg(app, args).await.map(|_| ())
}
/**
* ASS 使
*
@@ -310,22 +514,28 @@ pub async fn burn_ass_subtitle_to_image(
ass_path: &str,
output_path: &str,
) -> Result<(), String> {
// Validate path safety
let safe_image = validate_safe_path(image_path)?;
let safe_ass = validate_safe_path(ass_path)?;
let safe_output = sanitize_output_path(output_path)?;
let fonts_dir = get_fonts_dir(app)?;
let fonts_dir_str = fonts_dir.to_str().ok_or("Invalid fonts dir path")?;
let ass_path_escaped = ass_path.replace("'", "'\\''").replace(":", "\\:");
let fonts_dir_escaped = fonts_dir_str.replace("'", "'\\''");
let filter = format!("ass='{}':fontsdir='{}'", ass_path_escaped, fonts_dir_escaped);
let filter = format!(
"ass='{}':fontsdir='{}'",
escape_ffmpeg_path(&safe_ass),
escape_ffmpeg_path(fonts_dir_str)
);
let args = vec![
"-loop".to_string(), "1".to_string(),
"-i".to_string(), image_path.to_string(),
"-i".to_string(), safe_image,
"-vf".to_string(), filter,
"-t".to_string(), "1".to_string(),
"-frames:v".to_string(), "1".to_string(),
"-y".to_string(),
output_path.to_string()
safe_output
];
run_ffmpeg(app, args).await.map(|_| ())
+111
@@ -109,6 +109,15 @@ pub fn run() {
commands::product::delete_local_product,
// Rename a product
commands::product::rename_local_product,
// Audio management
commands::voice::save_audio,
commands::voice::list_project_audios,
commands::voice::delete_audio,
commands::voice::get_project_audios_dir,
// Audio processing
replace_audio_track,
mix_audio_tracks,
standardize_audio,
])
.run(tauri::generate_context!())
.expect("error while running tauri application");
@@ -216,6 +225,108 @@ async fn generate_cover_image(
}
}
// ============================================================
// Audio processing commands
// ============================================================
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
pub struct ReplaceAudioRequest {
video_path: String,
audio_path: String,
output_path: String,
}
#[tauri::command]
async fn replace_audio_track(
app: tauri::AppHandle,
request: ReplaceAudioRequest,
) -> ApiResponse<String> {
match ffmpeg_cmd::replace_audio_track(
&app,
&request.video_path,
&request.audio_path,
&request.output_path,
).await {
Ok(_) => ApiResponse {
code: 200,
message: "Audio replaced successfully".to_string(),
data: Some(request.output_path),
},
Err(e) => ApiResponse {
code: 500,
message: format!("Failed to replace audio: {}", e),
data: None,
},
}
}
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
pub struct MixAudioRequest {
voice_path: String,
bgm_path: String,
output_path: String,
voice_volume: f64,
bgm_volume: f64,
}
#[tauri::command]
async fn mix_audio_tracks(
app: tauri::AppHandle,
request: MixAudioRequest,
) -> ApiResponse<String> {
match ffmpeg_cmd::mix_audio_tracks(
&app,
&request.voice_path,
&request.bgm_path,
&request.output_path,
request.voice_volume,
request.bgm_volume,
).await {
Ok(_) => ApiResponse {
code: 200,
message: "Audio mixed successfully".to_string(),
data: Some(request.output_path),
},
Err(e) => ApiResponse {
code: 500,
message: format!("Failed to mix audio: {}", e),
data: None,
},
}
}
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
pub struct StandardizeAudioRequest {
input_path: String,
output_path: String,
}
#[tauri::command]
async fn standardize_audio(
app: tauri::AppHandle,
request: StandardizeAudioRequest,
) -> ApiResponse<String> {
match ffmpeg_cmd::standardize_audio(
&app,
&request.input_path,
&request.output_path,
).await {
Ok(_) => ApiResponse {
code: 200,
message: "Audio standardized successfully".to_string(),
data: Some(request.output_path),
},
Err(e) => ApiResponse {
code: 500,
message: format!("Failed to standardize audio: {}", e),
data: None,
},
}
}
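The three audio commands above repeat the same `match` that turns a `Result<(), String>` into a 200/500 envelope. A minimal sketch of how that mapping might be factored out; `respond` is a hypothetical helper, and the `ApiResponse` struct here is a local stand-in whose field types are assumptions, since the project's definition is not shown in this diff:

```rust
/// Local stand-in for the project's `ApiResponse`; field types are assumed.
#[derive(Debug, PartialEq)]
struct ApiResponse<T> {
    code: i32,
    message: String,
    data: Option<T>,
}

/// Hypothetical helper: map an FFmpeg result onto the 200/500 envelope.
fn respond<T>(
    result: Result<(), String>,
    data: T,
    ok_msg: &str,
    err_prefix: &str,
) -> ApiResponse<T> {
    match result {
        Ok(()) => ApiResponse {
            code: 200,
            message: ok_msg.to_string(),
            data: Some(data),
        },
        Err(e) => ApiResponse {
            code: 500,
            message: format!("{}: {}", err_prefix, e),
            data: None,
        },
    }
}
```

Each command body would then reduce to one call, passing `request.output_path` as `data` along with its own success and failure messages.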
// ============================================================
// Video composition commands
// ============================================================
+1
@@ -10,6 +10,7 @@ pub mod project;
pub mod avatar;
pub mod auth;
pub mod cache;
pub mod voice;
pub use engine::{
atomic_write_json, atomic_write_bytes,
+8 -8
@@ -8,18 +8,18 @@ use tauri::{AppHandle, Manager};
use crate::storage::engine::{sanitize_id, StorageError};
/// Get the Meijiaka user documents directory
/// ~/Documents/Meijiaka/
/// ~/Documents/Meijiaka-zj/
pub fn get_meijiaka_dir() -> Result<PathBuf, StorageError> {
let base = dirs::document_dir()
.ok_or_else(|| StorageError::Io(std::io::Error::new(
std::io::ErrorKind::NotFound,
"无法获取文档目录",
)))?;
Ok(base.join("Meijiaka"))
Ok(base.join("Meijiaka-zj"))
}
/// Get the project directory path (not created automatically)
/// ~/Documents/Meijiaka/projects/{project_id}/
/// ~/Documents/Meijiaka-zj/projects/{project_id}/
pub fn get_project_dir_path(project_id: &str) -> Result<PathBuf, StorageError> {
let safe_id = sanitize_id(project_id)?;
let base = get_meijiaka_dir()?;
@@ -27,7 +27,7 @@ pub fn get_project_dir_path(project_id: &str) -> Result<PathBuf, StorageError> {
}
/// Get the project directory (created automatically)
/// ~/Documents/Meijiaka/projects/{project_id}/
/// ~/Documents/Meijiaka-zj/projects/{project_id}/
pub fn get_project_dir(project_id: &str) -> Result<PathBuf, StorageError> {
let path = get_project_dir_path(project_id)?;
crate::storage::engine::ensure_dir(&path)?;
@@ -44,7 +44,7 @@ pub fn get_project_assets_dir(project_id: &str) -> Result<PathBuf, StorageError>
}
/// Get the videos directory inside the project
/// ~/Documents/Meijiaka/projects/{project_id}/videos/
/// ~/Documents/Meijiaka-zj/projects/{project_id}/videos/
pub fn get_project_videos_dir(project_id: &str) -> Result<PathBuf, StorageError> {
let path = get_project_dir(project_id)?;
let videos = path.join("videos");
@@ -53,7 +53,7 @@ pub fn get_project_videos_dir(project_id: &str) -> Result<PathBuf, StorageError>
}
/// Get the images directory inside the project
/// ~/Documents/Meijiaka/projects/{project_id}/images/
/// ~/Documents/Meijiaka-zj/projects/{project_id}/images/
pub fn get_project_images_dir(project_id: &str) -> Result<PathBuf, StorageError> {
let path = get_project_dir(project_id)?;
let images = path.join("images");
@@ -62,7 +62,7 @@ pub fn get_project_images_dir(project_id: &str) -> Result<PathBuf, StorageError>
}
/// Get the products directory
/// ~/Documents/Meijiaka/products/
/// ~/Documents/Meijiaka-zj/products/
pub fn get_products_dir() -> Result<PathBuf, StorageError> {
let base = get_meijiaka_dir()?;
let path = base.join("products");
@@ -71,7 +71,7 @@ pub fn get_products_dir() -> Result<PathBuf, StorageError> {
}
/// Get the avatar list JSON path
/// ~/Documents/Meijiaka/avatars.json
/// ~/Documents/Meijiaka-zj/avatars.json
pub fn get_avatars_json_path() -> Result<PathBuf, StorageError> {
let base = get_meijiaka_dir()?;
Ok(base.join("avatars.json"))
+164
@@ -0,0 +1,164 @@
//! Voice audio file storage module
//!
//! Manages local storage of TTS-synthesized audio and dubbing files.
use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};
use crate::storage::engine::{atomic_write_bytes, atomic_write_json, read_json, ensure_dir, StorageError};
use crate::storage::paths::get_project_dir;
/// Audio file metadata
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct AudioMeta {
pub id: String,
pub name: String,
pub file_path: String,
pub duration: f64,
pub file_size: u64,
pub voice_id: String,
pub created_at: String,
pub segment_id: Option<String>,
}
/// Audio list metadata file
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(rename_all = "camelCase")]
pub struct AudioListMeta {
pub project_id: String,
pub audios: Vec<AudioMeta>,
pub updated_at: String,
}
/// Save an audio file to the project directory
///
/// Writes the audio bytes (already decoded from base64 by the caller) into the `audios/` subdirectory.
pub fn save_audio_file(
project_id: &str,
audio_id: &str,
data: &[u8],
name: &str,
voice_id: &str,
duration: f64,
) -> Result<AudioMeta, StorageError> {
let project_dir = get_project_dir(project_id)?;
let audios_dir = project_dir.join("audios");
ensure_dir(&audios_dir)?;
let extension = "mp3";
let file_name = format!("{}.{}", audio_id, extension);
let file_path = audios_dir.join(&file_name);
atomic_write_bytes(&file_path, data)?;
let meta = AudioMeta {
id: audio_id.to_string(),
name: name.to_string(),
file_path: file_path.to_str().unwrap_or_default().to_string(),
duration,
file_size: data.len() as u64,
voice_id: voice_id.to_string(),
created_at: chrono_lite_now(),
segment_id: None,
};
// Update the audio list metadata
let list = load_audio_list_meta(project_id)?;
let mut list = list.unwrap_or(AudioListMeta {
project_id: project_id.to_string(),
audios: vec![],
updated_at: String::new(),
});
// Replace an existing entry that has the same ID
if let Some(pos) = list.audios.iter().position(|a| a.id == audio_id) {
list.audios[pos] = meta.clone();
} else {
list.audios.push(meta.clone());
}
list.updated_at = chrono_lite_now();
save_audio_list_meta(project_id, &list)?;
Ok(meta)
}
/// Load an audio file from the project (returns binary data)
pub fn load_audio_file(project_id: &str, audio_id: &str) -> Result<Vec<u8>, StorageError> {
let list = load_audio_list_meta(project_id)?
.ok_or_else(|| StorageError::Io(std::io::Error::new(
std::io::ErrorKind::NotFound,
"音频列表不存在",
)))?;
let audio_meta = list.audios.iter()
.find(|a| a.id == audio_id)
.ok_or_else(|| StorageError::Io(std::io::Error::new(
std::io::ErrorKind::NotFound,
format!("音频 {} 不存在", audio_id),
)))?;
let path = Path::new(&audio_meta.file_path);
std::fs::read(path).map_err(Into::into)
}
/// List all audio files in the project
pub fn list_project_audios(project_id: &str) -> Result<Vec<AudioMeta>, StorageError> {
let list = load_audio_list_meta(project_id)?;
Ok(list.map(|l| l.audios).unwrap_or_default())
}
/// Delete an audio file from the project
pub fn delete_audio_file(project_id: &str, audio_id: &str) -> Result<(), StorageError> {
let list = load_audio_list_meta(project_id)?;
let mut list = list.ok_or_else(|| StorageError::Io(std::io::Error::new(
std::io::ErrorKind::NotFound,
"音频列表不存在",
)))?;
let pos = list.audios.iter().position(|a| a.id == audio_id)
.ok_or_else(|| StorageError::Io(std::io::Error::new(
std::io::ErrorKind::NotFound,
format!("音频 {} 不存在", audio_id),
)))?;
let meta = list.audios.remove(pos);
let path = Path::new(&meta.file_path);
if path.exists() {
std::fs::remove_file(path)?;
}
list.updated_at = chrono_lite_now();
save_audio_list_meta(project_id, &list)
}
/// Get the path of the audio list metadata file
fn get_audio_list_path(project_id: &str) -> Result<PathBuf, StorageError> {
let project_dir = get_project_dir(project_id)?;
Ok(project_dir.join("audios.json"))
}
fn load_audio_list_meta(project_id: &str) -> Result<Option<AudioListMeta>, StorageError> {
let path = get_audio_list_path(project_id)?;
read_json(&path)
}
fn save_audio_list_meta(project_id: &str, list: &AudioListMeta) -> Result<(), StorageError> {
let path = get_audio_list_path(project_id)?;
atomic_write_json(&path, list)
}
/// Get the project audio directory path (for use by FFmpeg)
pub fn get_project_audios_dir(project_id: &str) -> Result<PathBuf, StorageError> {
let project_dir = get_project_dir(project_id)?;
let audios_dir = project_dir.join("audios");
ensure_dir(&audios_dir)?;
Ok(audios_dir)
}
// ====================== Utility functions ======================
fn chrono_lite_now() -> String {
chrono::Utc::now().to_rfc3339()
}
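`save_audio_file` above keeps `audios.json` free of duplicates by matching on `id` (not on the display name) and overwriting in place. The same upsert step, reduced to a self-contained sketch; `AudioEntry` and `upsert_by_id` are hypothetical names, with the struct trimmed to two fields for illustration:

```rust
/// Trimmed-down stand-in for `AudioMeta`, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct AudioEntry {
    id: String,
    name: String,
}

/// Hypothetical helper: overwrite the entry with the same `id`, or append a new one.
fn upsert_by_id(list: &mut Vec<AudioEntry>, entry: AudioEntry) {
    if let Some(pos) = list.iter().position(|a| a.id == entry.id) {
        list[pos] = entry; // existing ID: replace in place, keeping list order
    } else {
        list.push(entry); // new ID: append
    }
}
```

Replacing in place means re-synthesizing a segment with the same `audio_id` keeps its position in the list.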
+1 -1
@@ -12,7 +12,7 @@ pub fn uuid_v4() -> String {
/// Creates the directory if it does not exist
pub fn get_project_dir(app: &tauri::AppHandle) -> std::path::PathBuf {
let mut path = app.path().document_dir().unwrap_or_else(|_| std::env::temp_dir());
path.push("Meijiaka");
path.push("Meijiaka-zj");
path.push("Projects");
if !path.exists() {
let _ = std::fs::create_dir_all(&path);
+3 -3
@@ -1,8 +1,8 @@
{
"$schema": "https://schema.tauri.app/config/2",
"productName": "美家卡智影",
"productName": "美家卡智剪",
"version": "0.1.0",
"identifier": "cn.meijiaka.ai-video",
"identifier": "cn.meijiaka.ai-jian",
"build": {
"beforeDevCommand": "npm run dev",
"devUrl": "http://localhost:1420",
@@ -13,7 +13,7 @@
"windows": [
{
"label": "main",
"title": "美家卡 智影",
"title": "美家卡 智剪",
"width": 1440,
"height": 960,
"minWidth": 960,
+238
@@ -0,0 +1,238 @@
/**
* Voice API
* ===================
*
* TTS
*/
import { invoke } from '@tauri-apps/api/core';
import { client } from '../client';
// ====================== TTS types ======================
export interface VoiceInfo {
voiceId: string;
name: string;
description: string;
recommended: boolean;
}
export interface TTSSynthesizeRequest {
text: string;
voiceId?: string;
speed?: number;
volume?: number;
pitch?: number;
}
export interface TTSBatchSegment {
text: string;
index: number;
filename?: string;
}
export interface TTSBatchRequest {
segments: TTSBatchSegment[];
voiceId?: string;
speed?: number;
}
export interface TTSResult {
audioBase64: string;
format: string;
text: string;
voiceId: string;
}
export interface TTSBatchResult {
total: number;
successCount: number;
failedCount: number;
results: Array<{
index: number;
text: string;
outputPath?: string;
success: boolean;
error?: string;
}>;
}
// ====================== Voice cloning types ======================
export interface VoiceCloneSubmitRequest {
sourceAudioUrl: string;
sourceText?: string;
voiceName?: string;
}
export interface VoiceCloneTaskResponse {
taskId: string;
status: string;
voiceId?: string;
trialUrl?: string;
errorMessage?: string;
}
// ====================== Audio file management types ======================
export interface AudioMeta {
id: string;
name: string;
filePath: string;
duration: number;
fileSize: number;
voiceId: string;
createdAt: string;
segmentId?: string;
}
// ====================== TTS API ======================
/** Get the list of preset voices */
export async function getVoiceList(): Promise<VoiceInfo[]> {
const data = await client.get<{ voiceId: string; name: string; description: string; recommended: boolean }[]>('/voice/voices');
return data.map(v => ({
voiceId: v.voiceId,
name: v.name,
description: v.description,
recommended: v.recommended,
}));
}
/** Synchronous TTS synthesis (returns base64) */
export async function synthesizeTTS(request: TTSSynthesizeRequest): Promise<TTSResult> {
return client.post<TTSResult>('/voice/synthesize', request);
}
/** Batch TTS synthesis */
export async function synthesizeBatchTTS(request: TTSBatchRequest): Promise<TTSBatchResult> {
return client.post<TTSBatchResult>('/voice/synthesize-batch', request);
}
// ====================== Voice cloning API ======================
/** Submit a voice cloning task */
export async function submitCloneTask(request: VoiceCloneSubmitRequest): Promise<VoiceCloneTaskResponse> {
return client.post<VoiceCloneTaskResponse>('/voice/clone/submit', request);
}
/** Query the status of a cloning task */
export async function queryCloneTask(taskId: string, blocking = false): Promise<VoiceCloneTaskResponse> {
return client.get<VoiceCloneTaskResponse>(`/voice/clone/query/${taskId}?blocking=${blocking}`);
}
/** One-stop cloning: submit and wait for the result */
export async function cloneAndWait(request: VoiceCloneSubmitRequest, pollInterval = 5.0): Promise<VoiceCloneTaskResponse> {
return client.post<VoiceCloneTaskResponse>('/voice/clone/clone-and-wait', { ...request, pollInterval });
}
// ====================== Local audio file management (Tauri IPC) ======================
/** Save an audio file locally */
export async function saveAudio(args: {
projectId: string;
audioId: string;
audioData: string; // base64-encoded
name: string;
voiceId: string;
duration: number;
}): Promise<AudioMeta> {
const result = await invoke<{ code: number; data?: AudioMeta; message: string }>('save_audio', { args });
if (result.code !== 200 || !result.data) {
throw new Error(result.message || '保存音频失败');
}
return result.data;
}
/** List the project's audio files */
export async function listProjectAudios(projectId: string): Promise<AudioMeta[]> {
const result = await invoke<{ code: number; data?: AudioMeta[]; message: string }>('list_project_audios', { projectId });
if (result.code !== 200) {
throw new Error(result.message || '获取音频列表失败');
}
return result.data || [];
}
/** Delete an audio file */
export async function deleteAudio(projectId: string, audioId: string): Promise<void> {
const result = await invoke<{ code: number; message: string }>('delete_audio', { projectId, audioId });
if (result.code !== 200) {
throw new Error(result.message || '删除音频失败');
}
}
/** Get the project's audio directory */
export async function getProjectAudiosDir(projectId: string): Promise<string> {
const result = await invoke<{ code: number; data?: string; message: string }>('get_project_audios_dir', { projectId });
if (result.code !== 200 || !result.data) {
throw new Error(result.message || '获取音频目录失败');
}
return result.data;
}
// ====================== Audio processing commands (Tauri IPC) ======================
export interface ReplaceAudioRequest {
videoPath: string;
audioPath: string;
outputPath: string;
}
export interface MixAudioRequest {
voicePath: string;
bgmPath: string;
outputPath: string;
voiceVolume: number;
bgmVolume: number;
}
export interface StandardizeAudioRequest {
inputPath: string;
outputPath: string;
}
/** Audio replacement: replace the video's original audio with the dubbing track */
export async function replaceAudioTrack(args: ReplaceAudioRequest): Promise<string> {
const result = await invoke<{ code: number; data?: string; message: string }>('replace_audio_track', {
request: {
video_path: args.videoPath,
audio_path: args.audioPath,
output_path: args.outputPath,
},
});
if (result.code !== 200 || !result.data) {
throw new Error(result.message || '音频替换失败');
}
return result.data;
}
/** Mixing: blend the dubbing track with background music */
export async function mixAudioTracks(args: MixAudioRequest): Promise<string> {
const result = await invoke<{ code: number; data?: string; message: string }>('mix_audio_tracks', {
request: {
voice_path: args.voicePath,
bgm_path: args.bgmPath,
output_path: args.outputPath,
voice_volume: args.voiceVolume,
bgm_volume: args.bgmVolume,
},
});
if (result.code !== 200 || !result.data) {
throw new Error(result.message || '混音失败');
}
return result.data;
}
/** Audio standardization: transcode to MP3 44.1kHz 192kbps */
export async function standardizeAudio(args: StandardizeAudioRequest): Promise<string> {
const result = await invoke<{ code: number; data?: string; message: string }>('standardize_audio', {
request: {
input_path: args.inputPath,
output_path: args.outputPath,
},
});
if (result.code !== 200 || !result.data) {
throw new Error(result.message || '音频标准化失败');
}
return result.data;
}
+2
@@ -52,6 +52,8 @@ export interface ScriptShot {
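videoUrl?: string; // Qiniu Cloud video URL (for downstream processing such as subtitle generation)
elementId?: number; // Avatar ID used by the shot (Kling element_id)
voiceId?: string; // Voice ID used by B-roll shots
voiceVolume?: number; // Voice volume (0-2, default 1)
bgmVolume?: number; // Background music volume (0-2, default 0.3)
alignmentResult?: AlignmentResult; // Subtitle alignment result
burnedVideoPath?: string; // Path of the video with burned-in subtitles
burnedAt?: number; // Timestamp of the subtitle burn-in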
+3 -5
@@ -27,12 +27,10 @@
}
.sidebar-title {
font-size: var(--font-lg);
font-size: 16px;
font-weight: 700;
background: var(--primary-gradient);
background-clip: text;
-webkit-text-fill-color: transparent;
background-clip: text;
color: var(--text-primary);
white-space: nowrap;
}
.sidebar-nav {
+3 -3
@@ -83,11 +83,11 @@ export default function Sidebar({ currentPath, onNavigate }: SidebarProps) {
<div className="sidebar-header">
<div className="sidebar-logo">
<img
src="./assets/logo.png"
alt="美家卡 智影"
src="/assets/logo.png"
alt="美家卡 智剪"
style={{ width: 28, height: 24, objectFit: 'contain' }}
/>
<span className="sidebar-title"> </span>
<span className="sidebar-title"> </span>
</div>
</div>
+3 -3
@@ -80,11 +80,11 @@ export default function Login() {
<div className="login-brand">
<div className="login-logo">
<img
src="./assets/logo.png"
alt="美家卡 智影"
src="/assets/logo.png"
alt="美家卡 智剪"
style={{ width: 42, height: 36, objectFit: 'contain' }}
/>
<span className="login-logo-text"> </span>
<span className="login-logo-text"> </span>
</div>
<p className="login-subtitle">AI </p>
</div>
+1 -1
@@ -8,7 +8,7 @@ export default function AboutUs() {
<div className="card" style={{ padding: 0, overflow: 'hidden' }}>
<div className="settings-row">
<span className="settings-row-label"></span>
<span className="settings-row-value"> </span>
<span className="settings-row-value"> </span>
</div>
<div className="settings-row" style={{ borderTop: '1px solid var(--border-light)' }}>
<span className="settings-row-label"></span>
@@ -0,0 +1,173 @@
/**
* Audio mixing page styles
*/
.audio-mixing {
height: 100%;
display: flex;
flex-direction: column;
gap: var(--spacing-lg);
}
.audio-mixing-header {
padding: var(--spacing-lg);
border-bottom: 1px solid var(--border-light);
}
.audio-mixing-header h2 {
margin: 0 0 var(--spacing-xs);
font-size: var(--font-xl);
font-weight: 600;
}
.audio-mixing-desc {
margin: 0;
color: var(--text-secondary);
font-size: var(--font-sm);
}
.audio-mixing-content {
flex: 1;
display: flex;
gap: var(--spacing-lg);
padding: 0 var(--spacing-lg);
overflow: hidden;
}
/* Left: shot list */
.audio-mixing-sidebar {
width: 240px;
flex-shrink: 0;
display: flex;
flex-direction: column;
gap: var(--spacing-md);
}
.audio-mixing-sidebar h3 {
font-size: var(--font-base);
font-weight: 600;
margin: 0;
}
.segment-list {
flex: 1;
overflow-y: auto;
display: flex;
flex-direction: column;
gap: var(--spacing-xs);
}
.segment-item {
display: flex;
justify-content: space-between;
align-items: center;
padding: var(--spacing-md);
border-radius: var(--radius-md);
background: var(--bg-secondary);
cursor: pointer;
transition: all var(--transition-fast);
}
.segment-item:hover {
background: var(--bg-hover);
}
.segment-item.active {
background: var(--primary-light);
color: var(--primary);
}
.segment-index {
font-weight: 500;
}
.segment-duration {
font-size: var(--font-sm);
color: var(--text-tertiary);
}
/* Right: audio settings panel */
.audio-mixing-panel {
flex: 1;
background: var(--card-bg);
border-radius: var(--radius-lg);
padding: var(--spacing-xl);
overflow-y: auto;
}
.audio-mixing-panel h3 {
margin: 0 0 var(--spacing-xl);
font-size: var(--font-lg);
font-weight: 600;
}
.audio-control {
margin-bottom: var(--spacing-xl);
}
.audio-control label {
display: block;
font-size: var(--font-sm);
font-weight: 500;
color: var(--text-secondary);
margin-bottom: var(--spacing-sm);
}
.slider-container {
display: flex;
align-items: center;
gap: var(--spacing-md);
}
.slider-container input[type="range"] {
flex: 1;
height: 6px;
border-radius: 3px;
background: var(--border-light);
-webkit-appearance: none;
appearance: none;
}
.slider-container input[type="range"]::-webkit-slider-thumb {
-webkit-appearance: none;
appearance: none;
width: 18px;
height: 18px;
border-radius: 50%;
background: var(--primary);
cursor: pointer;
}
.slider-value {
min-width: 50px;
text-align: right;
font-size: var(--font-sm);
font-weight: 500;
color: var(--text-primary);
}
.audio-info {
margin-top: var(--spacing-xl);
padding: var(--spacing-md);
background: var(--bg-secondary);
border-radius: var(--radius-md);
}
.audio-info p {
margin: 0 0 var(--spacing-xs);
font-size: var(--font-sm);
color: var(--text-tertiary);
}
.audio-info p:last-child {
margin-bottom: 0;
}
/* Empty state */
.empty-state {
display: flex;
align-items: center;
justify-content: center;
height: 200px;
color: var(--text-tertiary);
}
@@ -0,0 +1,115 @@
/**
* Audio mixing page (Step 2)
* ====================
*
*
*/
import { useState } from 'react';
import { useProjectStore } from '../../store';
import './AudioMixing.css';
export default function AudioMixing() {
const segments = useProjectStore(state => state.segments);
const updateSegment = useProjectStore(state => state.updateSegment);
const [selectedSegmentId, setSelectedSegmentId] = useState<string | null>(
segments.length > 0 ? segments[0].id : null
);
const selectedSegment = segments.find(s => s.id === selectedSegmentId);
// A range input fires onChange on every drag tick, so these handlers only write
// to the store; the percentage label next to each slider is the visible feedback.
const handleVolumeChange = (e: React.ChangeEvent<HTMLInputElement>) => {
if (!selectedSegmentId) return;
const voiceVolume = parseFloat(e.target.value);
updateSegment(selectedSegmentId, { voiceVolume });
};
const handleBgmVolumeChange = (e: React.ChangeEvent<HTMLInputElement>) => {
if (!selectedSegmentId) return;
const bgmVolume = parseFloat(e.target.value);
updateSegment(selectedSegmentId, { bgmVolume });
};
return (
<div className="audio-mixing">
<div className="audio-mixing-header">
<h2></h2>
<p className="audio-mixing-desc"></p>
</div>
<div className="audio-mixing-content">
{/* 左侧:分镜列表 */}
<div className="audio-mixing-sidebar">
<h3></h3>
<div className="segment-list">
{segments.map((segment, index) => (
<div
key={segment.id}
className={`segment-item ${segment.id === selectedSegmentId ? 'active' : ''}`}
onClick={() => setSelectedSegmentId(segment.id)}
>
<span className="segment-index"> {index + 1}</span>
<span className="segment-duration">{segment.duration || 0}s</span>
</div>
))}
</div>
</div>
{/* 右侧:音频设置 */}
<div className="audio-mixing-panel">
{selectedSegment ? (
<>
<h3> {segments.findIndex(s => s.id === selectedSegmentId) + 1} </h3>
<div className="audio-control">
<label></label>
<div className="slider-container">
<input
type="range"
min="0"
max="2"
step="0.1"
value={selectedSegment.voiceVolume ?? 1}
onChange={handleVolumeChange}
/>
<span className="slider-value">
{Math.round((selectedSegment.voiceVolume ?? 1) * 100)}%
</span>
</div>
</div>
<div className="audio-control">
<label></label>
<div className="slider-container">
<input
type="range"
min="0"
max="2"
step="0.1"
value={selectedSegment.bgmVolume ?? 0.3}
onChange={handleBgmVolumeChange}
/>
<span className="slider-value">
{Math.round((selectedSegment.bgmVolume ?? 0.3) * 100)}%
</span>
</div>
</div>
<div className="audio-info">
<p>使 Kling AI TTS </p>
<p></p>
</div>
</>
) : (
<div className="empty-state">
<p></p>
</div>
)}
</div>
</div>
</div>
);
}
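The two sliders above map a raw range-input string to a gain factor between 0 and 2 and show it as a percentage. A minimal sketch of that mapping as pure functions follows; `toVolume`, `volumeLabel`, `MIN_VOLUME` and `MAX_VOLUME` are hypothetical names for illustration and are not part of this commit.

```typescript
// Hypothetical helpers (not in the commit): the bounds mirror the slider's min="0" max="2".
const MIN_VOLUME = 0;
const MAX_VOLUME = 2;

/** Parse a range-input value; fall back when it is not a finite number, then clamp. */
function toVolume(raw: string, fallback: number): number {
  const parsed = Number.parseFloat(raw);
  if (!Number.isFinite(parsed)) return fallback;
  return Math.min(MAX_VOLUME, Math.max(MIN_VOLUME, parsed));
}

/** Format a gain factor the way the slider label does: 0.3 becomes "30%". */
function volumeLabel(volume: number): string {
  return `${Math.round(volume * 100)}%`;
}
```

Keeping the parsing outside the component would let the clamping be unit-tested without rendering anything.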
@@ -5,6 +5,26 @@
* 请使用 Slider 组件或从 Slider.css 导入样式
*/
/* 创作主题分组 */
.topic-category-title {
font-size: var(--font-sm);
color: var(--text-secondary);
margin-bottom: var(--spacing-sm);
}
.topic-groups {
display: grid;
grid-template-columns: repeat(3, 1fr);
gap: var(--spacing-sm);
}
.topic-groups .option-card {
min-width: 0;
padding: var(--spacing-md);
font-size: var(--font-sm);
white-space: nowrap;
}
/* 时长滑块容器 */
.duration-slider-container {
display: flex;
@@ -9,10 +9,18 @@ import ConfirmModal from '../../components/Modal/ConfirmModal';
import './ScriptCreation.css';
import '../../components/Slider/Slider.css';
const scriptTypeOptions = ['对比型', '恐吓型', '干货型', '共情型', '挑战型', '福利型'];
// 创作主题配置
const TOPIC_ITEMS = [
'装修合同避坑',
'装修全流程避坑',
'装修材料避坑',
'装修报价避坑',
'全屋定制避坑',
'装修常见问题',
];
// 时长刻度配置
const DURATION_MARKS = [30, 45, 60, 75];
const DURATION_MARKS = [30, 60, 90, 120];
/**
*
@@ -25,10 +33,7 @@ export default function ScriptCreation() {
const topic = useProjectStore(state => state.topic) || '';
const setTopic = useProjectStore(state => state.setTopic);
const scriptDuration = useProjectStore(state => state.scriptDuration) || 45;
const scriptType = useProjectStore(state => state.scriptType);
const setScriptDuration = useProjectStore(state => state.setScriptDuration);
const setScriptType = useProjectStore(state => state.setScriptType);
const selectedType = scriptType ? Math.max(0, scriptTypeOptions.indexOf(scriptType)) : 0;
const [generating, setGenerating] = useState(false);
const { submit } = useTask();
@@ -89,7 +94,7 @@ export default function ScriptCreation() {
*/
const doGenerate = async () => {
if (!topic.trim()) {
toast.warning('请输入创作灵感');
toast.warning('请选择创作主题');
return;
}
@@ -100,7 +105,6 @@ export default function ScriptCreation() {
{
topic,
duration: scriptDuration,
style: scriptTypeOptions[selectedType],
},
{
callbacks: {
@@ -258,22 +262,21 @@ export default function ScriptCreation() {
<div className="step-layout script-creation">
{/* Left Panel - 参数配置 */}
<div className="step-panel-left">
{/* 创作主题 */}
<div className="panel-section">
<label className="panel-label">
<span style={{ color: 'var(--color-danger)', marginLeft: 4 }}>*</span>
</label>
<span className="panel-sublabel">
B站等
</span>
<textarea
placeholder="
B站等平台的视频链接"
value={topic}
onChange={e => setTopic(e.target.value)}
style={{ resize: 'vertical', height: 200 }}
/>
<label className="panel-label"></label>
<div className="topic-category-title"></div>
<div className="topic-groups">
{TOPIC_ITEMS.map((item) => (
<button
key={item}
className={`option-card ${topic === item ? 'selected' : ''}`}
onClick={() => setTopic(item)}
>
{item}
</button>
))}
</div>
</div>
<div className="panel-section">
@@ -286,12 +289,12 @@ export default function ScriptCreation() {
type="range"
className="slider-input"
min={30}
max={75}
max={120}
step={1}
value={scriptDuration}
onChange={e => setScriptDuration(Number(e.target.value))}
style={
{ '--slider-percent': `${((scriptDuration - 30) / 45) * 100}%` } as React.CSSProperties
{ '--slider-percent': `${((scriptDuration - 30) / 90) * 100}%` } as React.CSSProperties
}
/>
<div className="duration-marks">
@@ -308,22 +311,6 @@ export default function ScriptCreation() {
</div>
</div>
<div className="panel-section">
<label className="panel-label"></label>
<div className="option-cards">
{scriptTypeOptions.map((opt, i) => (
<button
key={i}
className={`option-card ${selectedType === i ? 'selected' : ''}`}
onClick={() => setScriptType(opt)}
title={getScriptTypeDesc(opt)}
>
{opt}
</button>
))}
</div>
</div>
{/* 生成按钮 */}
<button
className="btn btn-primary"
@@ -607,9 +594,9 @@ export default function ScriptCreation() {
</div>
<p className="empty-state-title"></p>
<p className="empty-state-desc">
<br />
"生成脚本"
"生成脚本"
</p>
</div>
)}
@@ -636,18 +623,3 @@ export default function ScriptCreation() {
</div>
);
}
/**
* tooltip
*/
function getScriptTypeDesc(type: string): string {
const descs: { [key: string]: string } = {
: '通过强烈对比突出效果差异,易引发转发',
: '直击痛点引发焦虑,再给出解决方案',
: '输出可落地的装修干货,建立专业度',
: '贴合业主心境引发共鸣,适合情感营销',
: '设定具体挑战目标,自带话题性',
: '以福利为钩子吸引停留,引导留资',
};
return descs[type] || type;
}
@@ -0,0 +1,256 @@
/**
* VideoEditing 样式
* ==================
*/
.video-editing {
width: 100%;
}
.editing-layout {
display: grid;
grid-template-columns: 280px 1fr;
gap: var(--spacing-lg);
margin-top: var(--spacing-md);
}
.editing-sidebar {
display: flex;
flex-direction: column;
gap: var(--spacing-sm);
}
.sidebar-title {
font-size: 13px;
font-weight: 600;
color: var(--text-secondary);
text-transform: uppercase;
letter-spacing: 0.05em;
padding: 0 var(--spacing-xs);
}
.segment-list {
display: flex;
flex-direction: column;
gap: var(--spacing-xs);
}
.segment-clip-item {
background: var(--bg-card);
border: 1px solid var(--border-light);
border-radius: var(--radius-md);
padding: var(--spacing-sm) var(--spacing-md);
cursor: pointer;
transition: all 0.15s ease;
position: relative;
overflow: hidden;
}
.segment-clip-item:hover {
border-color: var(--primary-light);
background: var(--bg-hover);
}
.segment-clip-item.active {
border-color: var(--primary);
background: color-mix(in srgb, var(--primary) 5%, var(--bg-card));
}
.segment-clip-item.processing {
opacity: 0.7;
}
.segment-info {
display: flex;
justify-content: space-between;
align-items: center;
}
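/* Scoped to .video-editing: AudioMixing.css declares the same class names globally */
.video-editing .segment-index {
font-size: 13px;
font-weight: 600;
color: var(--text-primary);
}
.video-editing .segment-duration {
font-size: 12px;
color: var(--text-secondary);
background: var(--bg-light);
padding: 2px 6px;
border-radius: var(--radius-sm);
}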
.segment-meta {
margin-top: 4px;
}
.audio-badge {
display: inline-flex;
align-items: center;
gap: 3px;
font-size: 11px;
padding: 2px 6px;
border-radius: var(--radius-sm);
}
.audio-badge.has-audio {
background: color-mix(in srgb, var(--success) 15%, transparent);
color: var(--success);
}
.audio-badge.no-audio {
background: var(--bg-light);
color: var(--text-secondary);
}
.processing-overlay {
position: absolute;
inset: 0;
background: rgba(255, 255, 255, 0.6);
display: flex;
align-items: center;
justify-content: center;
}
.mini-spinner {
width: 16px;
height: 16px;
border: 2px solid var(--border-light);
border-top-color: var(--primary);
border-radius: 50%;
animation: spin 0.6s linear infinite;
}
.editing-panel {
display: flex;
flex-direction: column;
gap: var(--spacing-md);
}
.video-preview-area {
background: var(--bg-card);
border-radius: var(--radius-lg);
overflow: hidden;
aspect-ratio: 9/16;
max-height: 400px;
display: flex;
align-items: center;
justify-content: center;
border: 1px solid var(--border-light);
}
.preview-video {
width: 100%;
height: 100%;
object-fit: contain;
}
.preview-placeholder {
color: var(--text-secondary);
font-size: 14px;
}
.clip-controls {
background: var(--bg-card);
border-radius: var(--radius-lg);
padding: var(--spacing-md);
border: 1px solid var(--border-light);
display: flex;
flex-direction: column;
gap: var(--spacing-md);
}
.control-section h4 {
font-size: 13px;
font-weight: 600;
color: var(--text-primary);
margin-bottom: var(--spacing-sm);
}
.trim-controls {
display: flex;
align-items: center;
gap: var(--spacing-md);
flex-wrap: wrap;
}
.trim-controls label {
display: flex;
align-items: center;
gap: 6px;
font-size: 13px;
color: var(--text-secondary);
}
.trim-input {
width: 60px;
padding: 4px 8px;
border: 1px solid var(--border-light);
border-radius: var(--radius-sm);
font-size: 13px;
text-align: center;
}
.trim-duration {
font-size: 13px;
color: var(--primary);
font-weight: 600;
background: color-mix(in srgb, var(--primary) 8%, transparent);
padding: 4px 10px;
border-radius: var(--radius-sm);
}
.checkbox-label {
display: flex;
align-items: center;
gap: 8px;
font-size: 13px;
cursor: pointer;
color: var(--text-primary);
}
.audio-replace-info {
display: flex;
align-items: center;
gap: 6px;
font-size: 12px;
color: var(--success);
margin-top: 6px;
}
.audio-replace-hint {
font-size: 12px;
color: var(--text-secondary);
margin-top: 6px;
font-style: italic;
}
.segment-voiceover {
background: var(--bg-card);
border-radius: var(--radius-lg);
padding: var(--spacing-md);
border: 1px solid var(--border-light);
}
.segment-voiceover h4 {
font-size: 13px;
font-weight: 600;
color: var(--text-primary);
margin-bottom: var(--spacing-sm);
}
.voiceover-text {
font-size: 13px;
color: var(--text-secondary);
line-height: 1.6;
}
.no-segment {
text-align: center;
padding: var(--spacing-xl);
color: var(--text-secondary);
}
@keyframes spin {
to { transform: rotate(360deg); }
}
@@ -0,0 +1,294 @@
/**
 * Video Editing (Step 3.5)
 * ======================
 *
 * Trim each segment, then mute its original audio or replace it
 * with the TTS voice-over.
 */
import { useState, useCallback, useRef } from 'react';
import { useProjectStore } from '../../store';
import { useVoiceStore } from '../../store/voiceStore';
import { getCurrentProjectId } from '../../api/modules/localStorage';
import { replaceAudioTrack } from '../../api/modules/voice';
import { toast } from '../../store/uiStore';
import './VideoEditing.css';
interface SegmentClip {
id: string;
duration: number; // 裁剪后时长(秒)
trimStart: number; // 裁剪起始点(秒)
trimEnd: number; // 裁剪结束点(秒)
muteOriginal: boolean;
replaceAudioId?: string;
videoPath: string;
voiceover: string;
type: 'segment' | 'empty_shot';
}
export default function VideoEditing() {
const segments = useProjectStore(state => state.segments);
const updateSegment = useProjectStore(state => state.updateSegment);
const projectId = getCurrentProjectId();
const { getAudioForSegment } = useVoiceStore();
const [activeId, setActiveId] = useState<string>(segments[0]?.id?.toString() || '');
const [isProcessing, setIsProcessing] = useState(false);
const [processingId, setProcessingId] = useState<string | null>(null);
// 当前分镜
const activeSegment = segments.find(s => s.id?.toString() === activeId);
const audioMeta = activeSegment ? getAudioForSegment(activeSegment.id?.toString() || '') : undefined;
// 裁剪状态
const [trimStart, setTrimStart] = useState(0);
const [trimEnd, setTrimEnd] = useState(
activeSegment?.duration ? parseInt(activeSegment.duration.replace(/[^0-9]/g, '') || '0') : 0
);
const [muteOriginal, setMuteOriginal] = useState(false);
const videoRef = useRef<HTMLVideoElement>(null);
// 分镜切换时重置状态
const handleSegmentSelect = (id: string) => {
setActiveId(id);
const seg = segments.find(s => s.id?.toString() === id);
if (seg) {
const dur = parseInt(seg.duration?.replace(/[^0-9]/g, '') || '0');
setTrimStart(0);
setTrimEnd(dur);
setMuteOriginal(false);
}
};
// Apply the trim, then replace or mute the original audio
const handleApplyClip = useCallback(async () => {
if (!activeSegment || !activeId || !projectId) return;
setIsProcessing(true);
setProcessingId(activeId);
try {
const segId = activeId;
// Clamp at 0 so an end point before the start point never yields a negative duration
const clippedDuration = `${Math.max(0, trimEnd - trimStart)}s`;
if (audioMeta?.filePath && activeSegment.videoPath) {
// A TTS voice-over exists: swap it in for the video's original audio track.
// The output sits next to the source video with a "_dubbed_<timestamp>" suffix;
// the pattern replaces the trailing extension, or appends when there is none,
// so the output path can never equal the input path.
const outputPath = activeSegment.videoPath.replace(
/(\.[^./\\]+)?$/,
`_dubbed_${Date.now()}.mp4`
);
await replaceAudioTrack({
videoPath: activeSegment.videoPath,
audioPath: audioMeta.filePath,
outputPath,
});
// Point the segment at the dubbed file and store the trimmed duration
updateSegment(activeSegment.id!, {
videoPath: outputPath,
duration: clippedDuration,
});
toast.success(`分镜 ${segId} 音频替换完成`);
} else if (muteOriginal && activeSegment.videoPath) {
// Mute requested: only the duration is saved here.
// TODO: no mute flag is persisted yet; final composition still has to handle it.
updateSegment(activeSegment.id!, {
duration: clippedDuration,
});
toast.info('已标记静音,将在最终合成时处理');
} else {
// Nothing to change on the audio side: save the trimmed duration only
updateSegment(activeSegment.id!, {
duration: clippedDuration,
});
toast.success('剪辑参数已保存');
}
} catch (err) {
console.error('[VideoEditing] 处理失败:', err);
toast.error(`处理失败: ${err instanceof Error ? err.message : String(err)}`);
} finally {
setIsProcessing(false);
setProcessingId(null);
}
}, [activeSegment, activeId, projectId, audioMeta, muteOriginal, trimStart, trimEnd, updateSegment]);
// 视频预览时间更新
const handleTimeUpdate = () => {
if (videoRef.current) {
const current = videoRef.current.currentTime;
// 限制在裁剪范围内
if (current < trimStart) {
videoRef.current.currentTime = trimStart;
} else if (current > trimEnd) {
videoRef.current.currentTime = trimEnd;
videoRef.current.pause();
}
}
};
// 计算总剪辑后时长
const totalClippedDuration = segments.reduce((sum, s) => {
const dur = parseInt(s.duration?.replace(/[^0-9]/g, '') || '0');
return sum + dur;
}, 0);
return (
<div className="video-editing">
<div className="step-header">
<h2></h2>
<p className="step-desc">
{totalClippedDuration}s
</p>
</div>
<div className="editing-layout">
{/* 左侧:分镜列表 */}
<div className="editing-sidebar">
<div className="sidebar-title"></div>
<div className="segment-list">
{segments.map((seg, i) => {
const segId = seg.id?.toString() || String(i);
const audio = getAudioForSegment(segId);
const dur = parseInt(seg.duration?.replace(/[^0-9]/g, '') || '0');
const isActive = segId === activeId;
const isProcessingThis = processingId === segId;
return (
<div
key={segId}
className={`segment-clip-item ${isActive ? 'active' : ''} ${isProcessingThis ? 'processing' : ''}`}
onClick={() => handleSegmentSelect(segId)}
>
<div className="segment-info">
<span className="segment-index"> {i + 1}</span>
<span className="segment-duration">{dur}s</span>
</div>
<div className="segment-meta">
{audio ? (
<span className="audio-badge has-audio">
<svg width="10" height="10" viewBox="0 0 24 24" fill="currentColor">
<path d="M12 3v10.55A4 4 0 1014 17V7h4V3h-6z"/>
</svg>
</span>
) : (
<span className="audio-badge no-audio"></span>
)}
</div>
{isProcessingThis && (
<div className="processing-overlay">
<div className="mini-spinner" />
</div>
)}
</div>
);
})}
</div>
</div>
{/* 右侧:编辑控制区 */}
<div className="editing-panel">
{activeSegment ? (
<>
{/* 视频预览 */}
<div className="video-preview-area">
{activeSegment.videoPath ? (
<video
ref={videoRef}
src={`file://${activeSegment.videoPath}`}
controls
className="preview-video"
onTimeUpdate={handleTimeUpdate}
/>
) : (
<div className="preview-placeholder">
<span></span>
</div>
)}
</div>
{/* 裁剪控制 */}
<div className="clip-controls">
<div className="control-section">
<h4></h4>
<div className="trim-controls">
<label>
<input
type="number"
min={0}
max={trimEnd - 1}
value={trimStart}
onChange={e => setTrimStart(Math.max(0, parseInt(e.target.value) || 0))}
className="trim-input"
/>s
</label>
<label>
<input
type="number"
min={trimStart + 1}
value={trimEnd}
onChange={e => setTrimEnd(parseInt(e.target.value) || trimEnd)}
className="trim-input"
/>s
</label>
<span className="trim-duration">
{Math.max(0, trimEnd - trimStart)}s
</span>
</div>
</div>
{/* 音频替换 */}
<div className="control-section">
<h4></h4>
<label className="checkbox-label">
<input
type="checkbox"
checked={muteOriginal}
onChange={e => setMuteOriginal(e.target.checked)}
/>
</label>
{audioMeta && (
<div className="audio-replace-info">
<svg width="14" height="14" viewBox="0 0 24 24" fill="currentColor">
<path d="M12 3v10.55A4 4 0 1014 17V7h4V3h-6z"/>
</svg>
{audioMeta.name}
</div>
)}
{!audioMeta && (
<div className="audio-replace-hint">
</div>
)}
</div>
<button
className="btn btn-primary"
onClick={handleApplyClip}
disabled={isProcessing || !activeSegment.videoPath}
>
{isProcessing ? '处理中...' : '应用剪辑'}
</button>
</div>
{/* 分镜旁白 */}
<div className="segment-voiceover">
<h4></h4>
<p className="voiceover-text">{activeSegment.voiceover || '暂无旁白'}</p>
</div>
</>
) : (
<div className="no-segment">
<p></p>
</div>
)}
</div>
</div>
</div>
);
}
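The component above reads segment durations with `parseInt(duration.replace(/[^0-9]/g, ''))`, which also strips the decimal point, so a value such as "4.5s" would be read as 45. Whether fractional durations ever occur in this project is an assumption; if they do, a parser along these lines avoids the problem. It is a sketch only: `parseDurationSeconds` and `dubbedOutputPath` are hypothetical names, not functions from the commit.

```typescript
// Hypothetical helpers (not in the commit), shown as a sketch under the
// assumption that durations look like "5s", "4.5s" or a bare number.

/** Parse a duration into seconds; returns 0 when nothing numeric is found. */
function parseDurationSeconds(duration: string | number | undefined): number {
  if (typeof duration === 'number') {
    return Number.isFinite(duration) ? Math.max(0, duration) : 0;
  }
  if (!duration) return 0;
  // First run of digits with an optional fractional part
  const match = duration.match(/\d+(?:\.\d+)?/);
  return match ? Number.parseFloat(match[0]) : 0;
}

/** Build the "_dubbed_<stamp>.mp4" output path next to the source video. */
function dubbedOutputPath(videoPath: string, stamp: number): string {
  // Replace the trailing extension, or append when the path has none
  return videoPath.replace(/(\.[^./\\]+)?$/, `_dubbed_${stamp}.mp4`);
}
```

Passing the timestamp in, instead of calling `Date.now()` inside the helper, keeps the function deterministic and easy to test.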
@@ -0,0 +1,245 @@
/**
* VoiceDubbing 样式
* ==================
*/
.voice-dubbing {
width: 100%;
}
.dubbing-layout {
display: grid;
grid-template-columns: 1fr 1fr;
gap: var(--spacing-lg);
margin-top: var(--spacing-md);
}
.voice-panel,
.mapping-panel {
display: flex;
flex-direction: column;
gap: var(--spacing-md);
}
/* Scoped to .voice-dubbing: ScriptCreation also uses the panel-section class */
.voice-dubbing .panel-section {
background: var(--bg-card);
border: 1px solid var(--border-light);
border-radius: var(--radius-lg);
padding: var(--spacing-md);
}
.voice-dubbing .panel-section h4 {
font-size: 13px;
font-weight: 600;
color: var(--text-primary);
margin-bottom: var(--spacing-sm);
}
/* 音色网格 */
.voice-grid {
display: grid;
grid-template-columns: 1fr 1fr;
gap: var(--spacing-xs);
}
.voice-card {
border: 1px solid var(--border-light);
border-radius: var(--radius-md);
padding: var(--spacing-sm);
cursor: pointer;
transition: all 0.15s ease;
background: var(--bg-primary);
}
.voice-card:hover {
border-color: var(--primary-light);
background: var(--bg-hover);
}
.voice-card.selected {
border-color: var(--primary);
background: color-mix(in srgb, var(--primary) 5%, var(--bg-card));
}
.voice-name {
font-size: 13px;
font-weight: 600;
color: var(--text-primary);
display: flex;
align-items: center;
gap: 6px;
}
.recommended-tag {
font-size: 10px;
background: color-mix(in srgb, var(--primary) 15%, transparent);
color: var(--primary);
padding: 1px 5px;
border-radius: var(--radius-sm);
font-weight: 500;
}
.voice-desc {
font-size: 11px;
color: var(--text-secondary);
margin-top: 2px;
}
/* 试听 */
.preview-row {
display: flex;
gap: var(--spacing-sm);
align-items: flex-end;
}
.preview-text {
flex: 1;
padding: var(--spacing-sm);
border: 1px solid var(--border-light);
border-radius: var(--radius-md);
font-size: 13px;
resize: none;
line-height: 1.5;
font-family: inherit;
}
.preview-audio {
width: 100%;
height: 36px;
margin-top: var(--spacing-sm);
}
/* 批量合成 */
.batch-info {
display: flex;
gap: var(--spacing-md);
font-size: 12px;
color: var(--text-secondary);
margin-bottom: var(--spacing-sm);
flex-wrap: wrap;
}
.batch-btn {
width: 100%;
}
.progress-bar {
height: 4px;
background: var(--bg-light);
border-radius: 2px;
overflow: hidden;
margin-top: var(--spacing-sm);
}
.progress-fill {
height: 100%;
background: var(--primary);
transition: width 0.3s ease;
}
/* 分镜配音列表 */
.segment-voice-list {
display: flex;
flex-direction: column;
gap: var(--spacing-xs);
max-height: 400px;
overflow-y: auto;
}
.seg-voice-item {
border: 1px solid var(--border-light);
border-radius: var(--radius-md);
padding: var(--spacing-sm);
background: var(--bg-primary);
}
.seg-voice-item.empty-shot {
opacity: 0.5;
}
.seg-voice-info {
display: flex;
flex-direction: column;
gap: 4px;
}
.seg-voice-index {
font-size: 12px;
font-weight: 600;
color: var(--text-primary);
}
.seg-has-audio {
display: flex;
flex-direction: column;
gap: 4px;
}
.audio-name {
font-size: 11px;
color: var(--success);
}
.seg-audio-player {
height: 28px;
width: 100%;
}
.seg-no-audio {
font-size: 11px;
color: var(--text-secondary);
}
.seg-voiceover {
font-size: 11px;
color: var(--text-secondary);
margin-top: 4px;
line-height: 1.4;
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
}
/* 音频文件库 */
.audio-file-list {
display: flex;
flex-direction: column;
gap: var(--spacing-xs);
}
.audio-file-item {
display: flex;
align-items: center;
gap: var(--spacing-sm);
padding: var(--spacing-xs) 0;
border-bottom: 1px solid var(--border-light);
}
.audio-file-item:last-child {
border-bottom: none;
}
.audio-file-info {
flex: 1;
min-width: 0;
}
.audio-file-name {
font-size: 12px;
font-weight: 500;
color: var(--text-primary);
display: block;
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
}
.audio-file-size {
font-size: 11px;
color: var(--text-secondary);
}
.audio-file-player {
height: 28px;
flex-shrink: 0;
}
@@ -0,0 +1,316 @@
/**
 * Voice Dubbing
 * =============
 *
 * Pick a preset voice, preview it, and synthesize TTS audio
 * for every segment that has voice-over text.
 */
import { useState, useEffect, useCallback, useRef } from 'react';
import { useProjectStore } from '../../store';
import { useVoiceStore } from '../../store/voiceStore';
import { getCurrentProjectId } from '../../api/modules/localStorage';
import { synthesizeTTS, saveAudio } from '../../api/modules/voice';
import { toast } from '../../store/uiStore';
import './VoiceDubbing.css';
export default function VoiceDubbing() {
const segments = useProjectStore(state => state.segments);
const updateSegment = useProjectStore(state => state.updateSegment);
const projectId = getCurrentProjectId();
const {
presetVoices,
selectedVoiceId,
loadPresetVoices,
setSelectedVoiceId,
projectAudios,
loadProjectAudios,
getAudioForSegment,
setAudioMapping,
} = useVoiceStore();
const [isSynthesizing, setIsSynthesizing] = useState(false);
const [synthProgress, setSynthProgress] = useState(0);
const [synthTotal, setSynthTotal] = useState(0);
const [customText, setCustomText] = useState('');
const [customPreviewUrl, setCustomPreviewUrl] = useState<string | null>(null);
const audioPreviewRef = useRef<HTMLAudioElement>(null);
// Load the preset voices and this project's audio files
useEffect(() => {
loadPresetVoices();
if (projectId) {
loadProjectAudios(projectId);
}
}, [projectId, loadPresetVoices, loadProjectAudios]);
// 获取有旁白文本的分镜(排除空镜)
const voicedSegments = segments.filter(s => s.type !== 'empty_shot' && s.voiceover);
const totalChars = voicedSegments.reduce((sum, s) => sum + (s.voiceover?.length || 0), 0);
// Synthesize voice-over audio for every segment that has text
const handleBatchSynthesize = useCallback(async () => {
if (!projectId || voicedSegments.length === 0) {
toast.warning('没有需要合成的旁白');
return;
}
setIsSynthesizing(true);
setSynthProgress(0);
setSynthTotal(voicedSegments.length);
let successCount = 0;
let failCount = 0;
try {
for (let i = 0; i < voicedSegments.length; i++) {
const seg = voicedSegments[i];
const segId = seg.id?.toString() || String(i);
const text = seg.voiceover || '';
setSynthProgress(i + 1);
try {
// Synchronous TTS call, intended for text of ≤200 characters.
// Note: the text is sent as-is here and is not truncated.
const result = await synthesizeTTS({
text,
voiceId: selectedVoiceId,
speed: 1.0,
});
if (!result.audioBase64) {
throw new Error('未返回音频数据');
}
// Save the audio locally
const audioId = `tts_${segId}_${Date.now()}`;
const meta = await saveAudio({
projectId,
audioId,
audioData: result.audioBase64,
name: `旁白-${segId}`,
voiceId: selectedVoiceId,
duration: 0, // the duration is not available at this point
segmentId: segId,
});
// Link the audio to the segment
setAudioMapping(segId, meta.id);
// Store the audio path on the segment
updateSegment(seg.id!, { audioPath: meta.filePath });
successCount++;
} catch (err) {
// One failed segment does not abort the rest of the batch
console.error(`[VoiceDubbing] 分镜 ${segId} 合成失败:`, err);
failCount++;
}
}
if (successCount > 0) {
toast.success(`配音合成完成:成功 ${successCount}${failCount > 0 ? `,失败 ${failCount}` : ''}`);
} else {
toast.error('配音合成全部失败');
}
} finally {
setIsSynthesizing(false);
setSynthProgress(0);
}
}, [projectId, voicedSegments, selectedVoiceId, updateSegment, setAudioMapping]);
// Preview the selected voice
const handlePreviewVoice = useCallback(async () => {
if (!customText.trim()) {
toast.warning('请输入要预览的文本');
return;
}
try {
// Release the previous preview blob before creating a new one
if (customPreviewUrl) {
URL.revokeObjectURL(customPreviewUrl);
}
setCustomPreviewUrl(null);
const result = await synthesizeTTS({
text: customText.slice(0, 200),
voiceId: selectedVoiceId,
speed: 1.0,
});
if (!result.audioBase64) {
throw new Error('未返回音频数据');
}
// Decode the base64 payload into bytes; audio/mpeg is the registered MIME type for MP3
const audioBlob = new Blob(
[Uint8Array.from(atob(result.audioBase64), c => c.charCodeAt(0))],
{ type: 'audio/mpeg' }
);
const url = URL.createObjectURL(audioBlob);
setCustomPreviewUrl(url);
} catch (err) {
toast.error(`试听失败: ${err instanceof Error ? err.message : String(err)}`);
}
}, [customText, selectedVoiceId, customPreviewUrl]);
// Link an existing project audio file to a segment
const handleAssignToSegment = (audioId: string, segmentId: string) => {
setAudioMapping(segmentId, audioId);
// Also store the audio path on the segment
const audio = projectAudios.find(a => a.id === audioId);
if (audio) {
updateSegment(parseInt(segmentId, 10), { audioPath: audio.filePath });
}
toast.success('已关联到分镜');
};
const selectedVoice = presetVoices.find(v => v.voiceId === selectedVoiceId);
return (
<div className="voice-dubbing">
<div className="step-header">
<h2></h2>
<p className="step-desc">
{voicedSegments.length} {totalChars}
</p>
</div>
<div className="dubbing-layout">
{/* 左侧:音色选择 + 批量合成 */}
<div className="voice-panel">
{/* 音色选择 */}
<div className="panel-section">
<h4></h4>
<div className="voice-grid">
{presetVoices.map(voice => (
<div
key={voice.voiceId}
className={`voice-card ${voice.voiceId === selectedVoiceId ? 'selected' : ''}`}
onClick={() => setSelectedVoiceId(voice.voiceId)}
>
<div className="voice-name">
{voice.name}
{voice.recommended && <span className="recommended-tag"></span>}
</div>
<div className="voice-desc">{voice.description}</div>
</div>
))}
</div>
</div>
{/* 音色试听 */}
<div className="panel-section">
<h4></h4>
<div className="preview-row">
<textarea
className="preview-text"
value={customText}
onChange={e => setCustomText(e.target.value)}
placeholder="输入文本试听音色(≤200字)..."
rows={3}
maxLength={200}
/>
<button
className="btn btn-secondary"
onClick={handlePreviewVoice}
disabled={!customText.trim()}
>
</button>
</div>
{customPreviewUrl && (
<audio ref={audioPreviewRef} src={customPreviewUrl} controls className="preview-audio" />
)}
</div>
{/* 批量合成 */}
<div className="panel-section">
<h4></h4>
<div className="batch-info">
<span>{selectedVoice?.name}</span>
<span>{voicedSegments.length} </span>
<span> {totalChars} </span>
</div>
<button
className="btn btn-primary batch-btn"
onClick={handleBatchSynthesize}
disabled={isSynthesizing || voicedSegments.length === 0}
>
{isSynthesizing
? `合成中... ${synthProgress}/${synthTotal}`
: `${voicedSegments.length} 个分镜生成配音`}
</button>
{isSynthesizing && (
<div className="progress-bar">
<div
className="progress-fill"
style={{ width: `${(synthProgress / synthTotal) * 100}%` }}
/>
</div>
)}
</div>
</div>
{/* 右侧:分镜-配音映射 */}
<div className="mapping-panel">
<div className="panel-section">
<h4></h4>
<div className="segment-voice-list">
{segments.map((seg, i) => {
const segId = seg.id?.toString() || String(i);
const audio = getAudioForSegment(segId);
const isEmptyShot = seg.type === 'empty_shot';
return (
<div key={segId} className={`seg-voice-item ${isEmptyShot ? 'empty-shot' : ''}`}>
<div className="seg-voice-info">
<span className="seg-voice-index">
{isEmptyShot ? '🎬' : '🎙️'} {i + 1}
</span>
{audio ? (
<div className="seg-has-audio">
<span className="audio-name">{audio.name}</span>
<audio
src={`file://${audio.filePath}`}
controls
className="seg-audio-player"
/>
</div>
) : (
<span className="seg-no-audio">
{isEmptyShot ? '空镜无需配音' : '未配音'}
</span>
)}
</div>
<div className="seg-voiceover">{seg.voiceover || ''}</div>
</div>
);
})}
</div>
</div>
{/* 音频文件列表 */}
{projectAudios.length > 0 && (
<div className="panel-section">
<h4></h4>
<div className="audio-file-list">
{projectAudios.map(audio => (
<div key={audio.id} className="audio-file-item">
<div className="audio-file-info">
<span className="audio-file-name">{audio.name}</span>
<span className="audio-file-size">
{(audio.fileSize / 1024).toFixed(1)} KB
</span>
</div>
<audio
src={`file://${audio.filePath}`}
controls
className="audio-file-player"
/>
</div>
))}
</div>
</div>
)}
</div>
</div>
</div>
);
}
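The batch handler above notes that the synchronous TTS call is meant for text of at most 200 characters, yet it sends each voice-over unmodified. If longer voice-overs can occur, one option is to split the text at punctuation before calling the API. The sketch below assumes that approach; `splitForTts` is a hypothetical helper, the punctuation set is a guess, and how the resulting clips would be stitched back together is left open.

```typescript
// Hypothetical helper (not in the commit). The 200-character default comes from
// the comment in the batch handler; the punctuation set is an assumption.
const BREAK_CHARS = /[。！？；，!?;,.]/;

/** Split text into chunks of at most maxLen characters, preferring punctuation breaks. */
function splitForTts(text: string, maxLen = 200): string[] {
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > maxLen) {
    // Search backwards inside the window for the last punctuation mark
    let cut = -1;
    for (let i = maxLen - 1; i >= 0; i--) {
      if (BREAK_CHARS.test(rest.charAt(i))) {
        cut = i + 1; // keep the punctuation with the chunk it ends
        break;
      }
    }
    if (cut <= 0) cut = maxLen; // no punctuation in the window: hard cut
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut).trim();
  }
  if (rest) chunks.push(rest);
  return chunks;
}
```

Lengths are counted in UTF-16 code units, which matches `String.prototype.slice` as used by the preview handler; characters outside the Basic Multilingual Plane could be split by a hard cut.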
+29 -26
@@ -1,4 +1,5 @@
import ScriptCreation from './ScriptCreation';
import AudioMixing from './AudioMixing';
import VideoGeneration from './VideoGeneration';
import SubtitleBurning from './SubtitleBurning';
import CoverDesign from './CoverDesign';
@@ -11,17 +12,18 @@ import './VideoCreation.css';
const steps = [
{ id: 1, label: '脚本生成' },
{ id: 2, label: '视频生成' },
{ id: 3, label: '字幕压制' },
{ id: 4, label: '封面制作' },
{ id: 5, label: '视频合成' },
{ id: 2, label: '音频合成' },
{ id: 3, label: '视频生成' },
{ id: 4, label: '字幕压制' },
{ id: 5, label: '封面制作' },
{ id: 6, label: '视频合成' },
];
/**
*
*
*
* - 5 -> -> -> ->
* - 6 -> -> -> -> ->
* - 使 projectStore
*/
function VideoCreationContent() {
@@ -44,30 +46,29 @@ function VideoCreationContent() {
// 步骤完成检查
const isStep1Complete = segments.length > 0;
const isStep2Complete = isStep1Complete && segments.every(s => s.videoPath);
// Step 2 音频合成:只需有分镜即可(可以没有音频文件)
const isStep2Complete = isStep1Complete;
// Step 3 视频生成:所有分镜必须有视频路径
const isStep3Complete = isStep2Complete && segments.every(s => s.videoPath);
// Step 4+ 字幕压制:需要视频生成完成
const isStep4PlusComplete = isStep3Complete;
// 判断用户能否进入某步骤
const canAccessStep = (stepId: number) => {
if (stepId === 1) {
return true;
}
if (stepId === 2) {
return isStep1Complete;
}
if (stepId === 3) {
return isStep2Complete;
}
return isStep2Complete;
if (stepId === 1) return true;
if (stepId === 2) return isStep1Complete;
if (stepId === 3) return isStep2Complete;
// Steps 4-6 all require every segment to have a video
return isStep3Complete;
};
const canGoNext = () => {
if (currentStep === 1) {
return isStep1Complete;
}
if (currentStep === 2) {
return isStep2Complete;
}
return true;
if (currentStep === 1) return isStep1Complete;
if (currentStep === 2) return isStep2Complete;
// Step 3 and later all depend on video generation being complete
return isStep3Complete;
};
// 保存当前步骤数据(目前只有 Step 1 需要在切换前落盘)
@@ -163,12 +164,14 @@ function VideoCreationContent() {
case 1:
return <ScriptCreation />;
case 2:
return <VideoGeneration />;
return <AudioMixing />;
case 3:
return <SubtitleBurning />;
return <VideoGeneration />;
case 4:
return <CoverDesign />;
return <SubtitleBurning />;
case 5:
return <CoverDesign />;
case 6:
return <VideoComposite />;
default:
return <ScriptCreation />;
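The gating rules in this file reduce to two facts about the project: whether any segments exist, and whether every segment has a video. A sketch of the same rules as a pure function follows, which would make them testable outside the component. `SegmentLike` and this standalone `canAccessStep` signature are assumptions for illustration; the commit keeps the logic inline in `VideoCreationContent`.

```typescript
// Hypothetical standalone version (not in the commit) of the 6-step gating rules.
interface SegmentLike {
  videoPath?: string;
}

/** Decide whether a step can be opened, given the current segments. */
function canAccessStep(stepId: number, segments: SegmentLike[]): boolean {
  const hasScript = segments.length > 0; // Step 1 complete
  const hasAllVideos = hasScript && segments.every(s => Boolean(s.videoPath)); // Step 3 complete
  if (stepId === 1) return true;
  // Step 2 needs a script; Step 3 needs Step 2, which is complete as soon as segments exist
  if (stepId === 2 || stepId === 3) return hasScript;
  // Steps 4-6 need a video for every segment
  return hasAllVideos;
}
```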
+1 -4
@@ -11,11 +11,8 @@ export { useSettingsStore } from './settingsStore';
export { useProjectStore, useProjectStats, useShots, useShotActions, createNewProject, saveMetaToLocalFile } from './projectStore';
export { useUIStore, toast } from './uiStore';
export { useProgressStore } from './progressStore';
export { useTaskStore } from './taskStore';
export { useVoiceStore, usePresetVoices, useProjectAudios, useAudioMapping } from './voiceStore';
// 类型导出
export type { ProgressPhase, ProgressState } from './progressStore';
export type { LocalTask, TaskType, TaskStatus } from './taskStore';
// 类型导出
export type { UserInfo, AuthState } from './authStore';
+1 -97
@@ -30,7 +30,7 @@ export interface ProjectState {
exportedAt?: number;
createdAt?: number;
updatedAt?: number;
currentStep: number; // 当前视频创作步骤 (1-5)
currentStep: number; // 当前视频创作步骤 (1-6)
scriptDuration?: number; // 脚本生成参数:视频时长
scriptType?: string; // 脚本生成参数:脚本类型
}
@@ -270,103 +270,7 @@ export const useProjectStore = create<ProjectStore>()(
state.updatedAt = Date.now();
}),
// ===== 后端同步操作 =====
saveToBackend: async (projectId?: string) => {
const state = get();
try {
set(s => {
s._isLoading = true;
});
let pid = projectId;
if (!pid) {
const project = await projectApi.create({
title: `项目 ${new Date().toLocaleDateString()}`,
topic: state.segments[0]?.voiceover?.slice(0, 50) || '未命名项目',
scriptType: '干货型',
duration: state.segments.reduce((sum, s) => sum + parseInt(s.duration || '0'), 0),
});
pid = project.id;
}
await projectApi.saveSegments(pid, toBackendSegments(state.segments));
set(s => {
s._isLoading = false;
});
return pid;
} catch (error) {
console.error('[ProjectStore] 保存失败:', error);
set(s => {
s._isLoading = false;
});
throw error;
}
},
loadFromBackend: async (projectId: string) => {
try {
set(s => {
s._isLoading = true;
});
const project = await projectApi.get(projectId);
const segments: ScriptShot[] = (project.segments || []).map(adaptProjectSegment);
set(state => {
state.segments = segments;
state.updatedAt = Date.now();
});
set(s => {
s._isLoading = false;
});
} catch (error) {
console.error('[ProjectStore] 加载失败:', error);
set(s => {
s._isLoading = false;
});
throw error;
}
},
createProject: async (title: string, topic?: string) => {
try {
set(s => {
s._isLoading = true;
});
const project = await projectApi.create({
title,
topic,
status: 'draft',
});
set(state => {
state.segments = [];
state.selectedHumanId = undefined;
state.selectedElementId = undefined;
state.selectedVoiceId = undefined;
state.finalVideoPath = undefined;
state.coverPath = undefined;
state.createdAt = Date.now();
state.updatedAt = Date.now();
});
set(s => {
s._isLoading = false;
});
return project.id;
} catch (error) {
console.error('[ProjectStore] 创建失败:', error);
set(s => {
s._isLoading = false;
});
throw error;
}
},
}),
{
name: 'ai-video-project-config',
+80 -123
@@ -1,25 +1,23 @@
/**
* - Task Store
* ========================
*
* =============
*
*
* - localStorage Redis
* -
* - progressStore
*
* Redis
*/
import { create } from 'zustand';
import { immer } from 'zustand/middleware/immer';
export type TaskType = 'video' | 'image' | 'script' | 'subtitle' | 'copy' | 'avatar_clone';
export type TaskStatus = 'pending' | 'running' | 'completed' | 'failed';
export interface LocalTask {
export interface Task {
id: string;
type: TaskType;
projectId?: string;
status: TaskStatus;
progress: number;
progress: number; // 0-100
message: string;
result?: unknown;
error?: string;
@@ -28,128 +26,87 @@ export interface LocalTask {
}
interface TaskState {
tasks: LocalTask[];
isRestoring: boolean;
tasks: Task[];
// Actions
addTask: (task: Omit<Task, 'createdAt' | 'updatedAt'>) => void;
updateTask: (taskId: string, updates: Partial<Pick<Task, 'status' | 'progress' | 'message'>>) => void;
completeTask: (taskId: string, result?: unknown) => void;
failTask: (taskId: string, error: string) => void;
removeTask: (taskId: string) => void;
clearTasks: (projectId?: string) => void;
// Queries
getTask: (taskId: string) => Task | undefined;
getRunningTasks: () => Task[];
getTasksByProject: (projectId: string) => Task[];
}
interface TaskActions {
// Add a task
addTask: (task: Omit<LocalTask, 'createdAt' | 'updatedAt'>) => void;
export const useTaskStore = create<TaskState>()(
immer((set, get) => ({
tasks: [],
// Update task status
updateTask: (id: string, updates: Partial<Omit<LocalTask, 'id' | 'type'>>) => void;
addTask: (task) =>
set((state) => {
const now = Date.now();
state.tasks.push({
...task,
createdAt: now,
updatedAt: now,
});
}),
// Complete a task
completeTask: (id: string, result: unknown) => void;
updateTask: (taskId, updates) =>
set((state) => {
const task = state.tasks.find((t) => t.id === taskId);
if (task) {
Object.assign(task, updates, { updatedAt: Date.now() });
}
}),
// Mark a task as failed
failTask: (id: string, error: string) => void;
completeTask: (taskId, result) =>
set((state) => {
const task = state.tasks.find((t) => t.id === taskId);
if (task) {
task.status = 'completed';
task.progress = 100;
task.message = '任务完成';
task.result = result;
task.updatedAt = Date.now();
}
}),
// Remove a task
removeTask: (id: string) => void;
failTask: (taskId, error) =>
set((state) => {
const task = state.tasks.find((t) => t.id === taskId);
if (task) {
task.status = 'failed';
task.error = error;
task.message = '任务失败';
task.updatedAt = Date.now();
}
}),
// Get in-progress tasks
getRunningTasks: () => LocalTask[];
removeTask: (taskId) =>
set((state) => {
state.tasks = state.tasks.filter((t) => t.id !== taskId);
}),
// Get a project's tasks
getProjectTasks: (projectId: string) => LocalTask[];
clearTasks: (projectId) =>
set((state) => {
if (projectId) {
state.tasks = state.tasks.filter((t) => t.projectId !== projectId);
} else {
state.tasks = [];
}
}),
// Set the restoring flag
setRestoring: (value: boolean) => void;
getTask: (taskId) => get().tasks.find((t) => t.id === taskId),
// Clean up completed/failed tasks (keep the most recent 50)
cleanup: () => void;
}
getRunningTasks: () =>
get().tasks.filter((t) => t.status === 'pending' || t.status === 'running'),
const MAX_STORED_TASKS = 100;
export const useTaskStore = create<TaskState & TaskActions>((set, get) => ({
tasks: [],
isRestoring: false,
addTask: (task) => {
const now = Date.now();
const newTask: LocalTask = {
...task,
createdAt: now,
updatedAt: now,
};
set((state) => ({
tasks: [newTask, ...state.tasks].slice(0, MAX_STORED_TASKS),
}));
},
updateTask: (id, updates) => {
set((state) => ({
tasks: state.tasks.map((t) =>
t.id === id
? { ...t, ...updates, updatedAt: Date.now() }
: t
),
}));
},
completeTask: (id, result) => {
set((state) => ({
tasks: state.tasks.map((t) =>
t.id === id
? {
...t,
status: 'completed',
progress: 100,
result,
updatedAt: Date.now(),
}
: t
),
}));
},
failTask: (id, error) => {
set((state) => ({
tasks: state.tasks.map((t) =>
t.id === id
? {
...t,
status: 'failed',
error,
updatedAt: Date.now(),
}
: t
),
}));
},
removeTask: (id) => {
set((state) => ({
tasks: state.tasks.filter((t) => t.id !== id),
}));
},
getRunningTasks: () => {
return get().tasks.filter(
(t) => t.status === 'running' || t.status === 'pending'
);
},
getProjectTasks: (projectId) => {
return get().tasks.filter((t) => t.projectId === projectId);
},
setRestoring: (value) => {
set({ isRestoring: value });
},
cleanup: () => {
set((state) => {
// Keep in-progress tasks, plus the 20 most recently finished ones
const running = state.tasks.filter(
(t) => t.status === 'running' || t.status === 'pending'
);
const completed = state.tasks
.filter((t) => t.status === 'completed' || t.status === 'failed')
.slice(0, 20);
return { tasks: [...running, ...completed] };
});
},
}));
getTasksByProject: (projectId) =>
get().tasks.filter((t) => t.projectId === projectId),
}))
);
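The task lifecycle in the rewritten store (pending or running, then completed or failed) can be illustrated with pure functions. The following is a minimal sketch under stated assumptions, not the store's implementation: the `Task` shape is reduced to the fields used here, the functions return new arrays instead of mutating an immer draft, and the English message string is a placeholder.

```typescript
type TaskStatus = 'pending' | 'running' | 'completed' | 'failed';

// Reduced, assumed Task shape; the real interface has more fields.
interface Task {
  id: string;
  projectId?: string;
  status: TaskStatus;
  progress: number; // 0-100
  message: string;
  result?: unknown;
  updatedAt: number;
}

// Return a new list in which the matching task is marked completed.
function completeTask(
  tasks: Task[],
  taskId: string,
  result?: unknown,
  now: number = Date.now()
): Task[] {
  return tasks.map((t) =>
    t.id === taskId
      ? {
          ...t,
          status: 'completed' as const,
          progress: 100,
          message: 'Task completed', // placeholder text, not the store's string
          result,
          updatedAt: now,
        }
      : t
  );
}

// Tasks that are still pending or running.
function getRunningTasks(tasks: Task[]): Task[] {
  return tasks.filter((t) => t.status === 'pending' || t.status === 'running');
}

// Drop one project's tasks, or every task when no projectId is given.
function clearTasks(tasks: Task[], projectId?: string): Task[] {
  return projectId ? tasks.filter((t) => t.projectId !== projectId) : [];
}

const sampleTasks: Task[] = [
  { id: 'a', projectId: 'p1', status: 'running', progress: 40, message: '', updatedAt: 0 },
  { id: 'b', projectId: 'p2', status: 'pending', progress: 0, message: '', updatedAt: 0 },
];
console.log(getRunningTasks(completeTask(sampleTasks, 'a')).map((t) => t.id));
```

Keeping the transitions pure makes them easy to unit test apart from Zustand; the store itself applies the same changes inside `set` with immer.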
+219
@@ -0,0 +1,219 @@
/**
* Voice Store - Zustand
* ======================
*
* TTS
*/
import { create } from 'zustand';
import { useShallow } from 'zustand/react/shallow';
import type { VoiceInfo, AudioMeta } from '../api/modules/voice';
import * as voiceApi from '../api/modules/voice';
interface VoiceState {
// Preset voice list
presetVoices: VoiceInfo[];
selectedVoiceId: string;
// Project audio file list
projectAudios: AudioMeta[];
// Audio replacement mapping (segmentId → audioId)
// Records which audio file each segment uses
audioMapping: Record<string, string>; // segmentId → audioId
// Current project ID
currentProjectId: string | null;
// Loading state
isLoadingVoices: boolean;
isLoadingAudios: boolean;
}
interface VoiceActions {
// Voice operations
loadPresetVoices: () => Promise<void>;
setSelectedVoiceId: (id: string) => void;
// Project audio operations
loadProjectAudios: (projectId: string) => Promise<void>;
saveAudio: (args: {
projectId: string;
audioId: string;
audioData: string;
name: string;
voiceId: string;
duration: number;
segmentId?: string;
}) => Promise<AudioMeta>;
deleteAudio: (projectId: string, audioId: string) => Promise<void>;
// Audio mapping operations
setAudioMapping: (segmentId: string, audioId: string | undefined) => void;
getAudioForSegment: (segmentId: string) => AudioMeta | undefined;
// Reset
reset: () => void;
}
const initialState: VoiceState = {
presetVoices: [],
selectedVoiceId: '829826751244537879', // Gentle female voice (Kling preset voice)
projectAudios: [],
audioMapping: {},
currentProjectId: null,
isLoadingVoices: false,
isLoadingAudios: false,
};
export const useVoiceStore = create<VoiceState & VoiceActions>()(
(set, get) => ({
...initialState,
// ====================== Voice operations ======================
loadPresetVoices: async () => {
set({ isLoadingVoices: true });
try {
const voices = await voiceApi.getVoiceList();
set({ presetVoices: voices });
} catch (err) {
console.error('[VoiceStore] 加载音色列表失败:', err);
// Fail silently and fall back to defaults (Kling preset voices)
set({
presetVoices: [
{ voiceId: '829826751244537879', name: '温柔女声', description: '温柔细腻', recommended: true, language: 'zh' },
{ voiceId: '829824295735410756', name: '钓系女友', description: '甜美撒娇', recommended: false, language: 'zh' },
{ voiceId: '829826792415842333', name: '播报男声', description: '沉稳播报', recommended: false, language: 'zh' },
{ voiceId: '829826834144964676', name: '盐系少年', description: '清新少年', recommended: false, language: 'zh' },
{ voiceId: '829826884271091753', name: '撒娇女友', description: '可爱撒娇', recommended: false, language: 'zh' },
],
});
} finally {
set({ isLoadingVoices: false });
}
},
setSelectedVoiceId: (id) => set({ selectedVoiceId: id }),
// ====================== Project audio operations ======================
loadProjectAudios: async (projectId) => {
set({ isLoadingAudios: true, currentProjectId: projectId });
try {
const audios = await voiceApi.listProjectAudios(projectId);
set({ projectAudios: audios });
// Rebuild audioMapping from the existing audio files
const mapping: Record<string, string> = {};
for (const audio of audios) {
if (audio.segmentId) {
mapping[audio.segmentId] = audio.id;
}
}
set({ audioMapping: mapping });
} catch (err) {
console.error('[VoiceStore] 加载项目音频失败:', err);
set({ projectAudios: [] });
} finally {
set({ isLoadingAudios: false });
}
},
saveAudio: async (args) => {
const meta = await voiceApi.saveAudio(args);
const state = get();
// Update the projectAudios list
const existing = state.projectAudios.findIndex(a => a.id === meta.id);
const updatedAudios = existing >= 0
? state.projectAudios.map((a, i) => i === existing ? meta : a)
: [...state.projectAudios, meta];
set({ projectAudios: updatedAudios });
// If a segmentId is provided, update audioMapping
if (args.segmentId) {
set(state => ({
audioMapping: { ...state.audioMapping, [args.segmentId!]: meta.id },
}));
}
return meta;
},
deleteAudio: async (projectId, audioId) => {
await voiceApi.deleteAudio(projectId, audioId);
const state = get();
const updatedAudios = state.projectAudios.filter(a => a.id !== audioId);
// Remove from audioMapping
const updatedMapping = { ...state.audioMapping };
for (const [segId, audioId2] of Object.entries(updatedMapping)) {
if (audioId2 === audioId) {
delete updatedMapping[segId];
}
}
set({ projectAudios: updatedAudios, audioMapping: updatedMapping });
},
// ====================== Audio mapping operations ======================
setAudioMapping: (segmentId, audioId) => {
set(state => {
if (audioId === undefined) {
const updated = { ...state.audioMapping };
delete updated[segmentId];
return { audioMapping: updated };
}
return { audioMapping: { ...state.audioMapping, [segmentId]: audioId } };
});
},
getAudioForSegment: (segmentId) => {
const state = get();
const audioId = state.audioMapping[segmentId];
if (!audioId) return undefined;
return state.projectAudios.find(a => a.id === audioId);
},
// ====================== Reset ======================
reset: () => set(initialState),
})
);
// Convenience hooks - use useShallow for shallow comparison to avoid unnecessary re-renders
export function usePresetVoices() {
return useVoiceStore(
useShallow(state => ({
voices: state.presetVoices,
selectedVoiceId: state.selectedVoiceId,
isLoading: state.isLoadingVoices,
load: state.loadPresetVoices,
setSelected: state.setSelectedVoiceId,
}))
);
}
export function useProjectAudios() {
return useVoiceStore(
useShallow(state => ({
audios: state.projectAudios,
isLoading: state.isLoadingAudios,
load: state.loadProjectAudios,
deleteAudio: state.deleteAudio,
}))
);
}
export function useAudioMapping() {
return useVoiceStore(
useShallow(state => ({
mapping: state.audioMapping,
getAudio: state.getAudioForSegment,
setMapping: state.setAudioMapping,
}))
);
}
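The segment-to-audio bookkeeping in the voice store (rebuild the mapping on load, upsert on save, prune on delete) can be summarised with three pure helpers. This is a minimal sketch, not the store's code: the `AudioMeta` shape is an assumption reduced to `id` and `segmentId` (the real type lives in `../api/modules/voice`, which is not shown), and the helper names are hypothetical.

```typescript
// Assumed, reduced shape; the real AudioMeta is not shown in this diff.
interface AudioMeta {
  id: string;
  segmentId?: string;
}

type AudioMapping = Record<string, string>; // segmentId → audioId

// Rebuild the mapping from a loaded audio list (as in loadProjectAudios).
function buildAudioMapping(audios: AudioMeta[]): AudioMapping {
  const mapping: AudioMapping = {};
  for (const audio of audios) {
    if (audio.segmentId) {
      mapping[audio.segmentId] = audio.id;
    }
  }
  return mapping;
}

// Replace an entry with the same id, or append it (as in saveAudio).
function upsertAudio(audios: AudioMeta[], meta: AudioMeta): AudioMeta[] {
  const index = audios.findIndex((a) => a.id === meta.id);
  return index >= 0
    ? audios.map((a, i) => (i === index ? meta : a))
    : [...audios, meta];
}

// Drop every segment entry that points at a deleted audio (as in deleteAudio).
function pruneMapping(mapping: AudioMapping, audioId: string): AudioMapping {
  const pruned: AudioMapping = {};
  for (const segmentId of Object.keys(mapping)) {
    if (mapping[segmentId] !== audioId) {
      pruned[segmentId] = mapping[segmentId];
    }
  }
  return pruned;
}

const loadedAudios: AudioMeta[] = [
  { id: 'audio-1', segmentId: 'seg-1' },
  { id: 'audio-2' },
];
console.log(buildAudioMapping(upsertAudio(loadedAudios, { id: 'audio-2', segmentId: 'seg-2' })));
```

Because each helper returns a new object, the results can be passed straight to a Zustand `set` call without mutating existing state.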