refactor: 第三方平台架构改造(Adapter Protocol + Gateway)
Phase 1: 异常体系统一 - 新增 PlatformError / PlatformErrorType 标准定义 - 改造所有 Provider 异常抛出为 PlatformError - 注册全局 PlatformError exception handler Phase 2: Adapter Protocol - 新增 app/ai/adapters/base.py(PlatformAdapter + SyncCapable + TaskCapable + CallbackCapable) - 新增 app/ai/adapters/constants.py(Method 常量) - 新增 PlatformConfigLoader(config/platform-config.yaml) Phase 3: HTTP Client 统一 - ViduProvider 从 aiohttp 迁移到 httpx(注入方式) - VolcengineCaptionService 改为注入 http_client - lifespan 统一管理所有 Client 创建和关闭 Phase 4: Gateway 骨架 + Adapter 实现 - 新增 ViduAdapter / VolcengineArkAdapter / VolcengineCaptionAdapter - 新增 PlatformGateway(call_sync / submit_task / query_task / handle_webhook) - 新增 LLMGateway(带 Fallback 降级链) - lifespan 注册所有 Adapter 和 Gateway Phase 6: 清理与验证 - 从 Settings 移除 VIDU_BASE_URL / VOLCENGINE_BASE_URL - Provider 改为从 PlatformConfigLoader 读取 base_url - 清理 volcengine_caption_service 全局单例 - config_loader 默认路径改为 platform-config.yaml - Scheduler 注入共享 HTTP client - vidu.py 回调路由使用 Adapter 验签和解析 - ruff 全量通过,应用启动测试通过
This commit is contained in:
@@ -0,0 +1,704 @@
|
||||
# 第三方平台接入架构设计方案(最终版)
|
||||
|
||||
> 版本:v1.0 Final
|
||||
> 适用范围:`python-api/` 所有第三方服务接入层
|
||||
> 生效日期:2026-05-02
|
||||
|
||||
---
|
||||
|
||||
## 一、设计目标
|
||||
|
||||
| 目标 | 验收标准 |
|
||||
|------|---------|
|
||||
| 新增平台接入成本 < 30 分钟 | 提供 Adapter 模板,复制粘贴后填充 4 个方法即可 |
|
||||
| 第三方故障不拖垮用户 | 单点故障时,用户 100ms 内收到明确错误,而非超时 30 秒 |
|
||||
| 多用户同时使用无冲突 | 5 个用户同时生成 TTS/脚本时,不触发第三方 429 限流 |
|
||||
| 任务状态可追踪 | 用户关闭应用后重开,能恢复进行中的对口型/字幕任务 |
|
||||
| 未来换平台无感知 | 换 TTS 供应商时,前端接口、存储层、用户历史记录全部无感知 |
|
||||
|
||||
---
|
||||
|
||||
## 二、整体架构
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────┐
|
||||
│ Router(FastAPI) │
|
||||
│ - 校验输入、序列化输出 │
|
||||
│ - 统一错误中间件 │
|
||||
│ - 不处理重试、不限流、不直接调第三方 │
|
||||
│ - 回调入口:/webhooks/{platform} │
|
||||
├──────────────────────────────────────────────┤
|
||||
│ Application Service │
|
||||
│ - ScriptService:编排脚本→TTS→对口型 │
|
||||
│ - VideoService:编排字幕→合成 │
|
||||
│ - 只操作领域对象,不感知平台差异 │
|
||||
├──────────────────────────────────────────────┤
|
||||
│ Gateway Layer │
|
||||
│ ┌─────────────────┐ ┌──────────────────┐ │
|
||||
│ │ LLM Gateway │ │ Task Gateway │ │
|
||||
│ │ - 模型路由 │ │ - 任务状态机 │ │
|
||||
│ │ - Fallback │ │ - 轮询调度 │ │
|
||||
│ │ - 流式代理 │ │ - 回调处理 │ │
|
||||
│ └────────┬────────┘ └────────┬─────────┘ │
|
||||
│ │ │ │
|
||||
│ └────────────────────┘ │
|
||||
│ │ │
|
||||
│ ┌────────▼────────┐ │
|
||||
│ │ Shared Infra │ │
|
||||
│ │ - Token Bucket │ │
|
||||
│ │ - CircuitBreaker│ │
|
||||
│ │ - stamina retry │ │
|
||||
│ │ - Structured Log│ │
|
||||
│ └────────┬────────┘ │
|
||||
├────────────────────┼─────────────────────────┤
|
||||
│ Adapter Layer │ │
|
||||
│ - VolcengineArkAdapter │
|
||||
│ - OpenAIAdapter │
|
||||
│ - ViduAdapter │
|
||||
│ - VolcengineCaptionAdapter │
|
||||
│ - MockAdapter │
|
||||
│ 每个 Adapter:Protocol 约定,无状态,可替换 │
|
||||
├──────────────────────────────────────────────┤
|
||||
│ Transport Layer │
|
||||
│ - httpx.AsyncClient(所有 Raw HTTP) │
|
||||
│ - 官方 SDK(仅 LLM 层:AsyncArk/AsyncOpenAI)│
|
||||
│ - lifespan 显式创建、显式关闭 │
|
||||
└────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 三、分层设计
|
||||
|
||||
### 3.1 Adapter 层
|
||||
|
||||
**职责**:纯翻译。把内部标准请求 ↔ 供应商特定请求,内部标准响应 ↔ 供应商特定响应。
|
||||
|
||||
**不职责**:重试、限流、业务逻辑、状态管理。
|
||||
|
||||
**Protocol 约定**:
|
||||
|
||||
```python
|
||||
class LLMAdapter(Protocol):
|
||||
platform_id: str
|
||||
async def chat(self, messages, model, **params) -> AdapterResponse: ...
|
||||
async def chat_stream(self, messages, model, **params): ... # AsyncIterator
|
||||
async def health(self) -> AdapterResponse: ...
|
||||
async def close(self) -> None: ...
|
||||
|
||||
class TaskAdapter(Protocol):
|
||||
platform_id: str
|
||||
async def submit(self, task_type, payload, callback_url) -> AdapterResponse: ...
|
||||
async def query(self, platform_task_id) -> TaskStatus: ...
|
||||
async def parse_callback(self, body) -> TaskStatus: ...
|
||||
async def verify_signature(self, headers, body, secret) -> bool: ...
|
||||
async def extract_nonce(self, headers) -> str | None: ...
|
||||
async def health(self) -> AdapterResponse: ...
|
||||
async def close(self) -> None: ...
|
||||
```
|
||||
|
||||
**AdapterResponse 标准格式**:
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class AdapterResponse:
|
||||
success: bool
|
||||
data: dict | None = None
|
||||
error_code: str | None = None
|
||||
error_message: str | None = None
|
||||
retryable: bool = False # Gateway 据此决定是否重试
|
||||
```
|
||||
|
||||
**TaskStatus 标准格式**:
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class TaskStatus:
|
||||
task_id: str # 供应商 task_id
|
||||
state: str # "pending" | "processing" | "completed" | "failed"
|
||||
result: dict | None = None
|
||||
error_message: str | None = None
|
||||
```
|
||||
|
||||
**Client 统一**:
|
||||
- 所有 Raw HTTP 用 `httpx.AsyncClient`。
|
||||
- LLM 官方 SDK(AsyncArk、AsyncOpenAI)保留,但 lifespan shutdown 时显式 `close()`。
|
||||
- 每个 Adapter 独立 Client,独立连接池,互不干扰。
|
||||
|
||||
---
|
||||
|
||||
### 3.2 Gateway 层
|
||||
|
||||
#### 3.2.1 LLM Gateway
|
||||
|
||||
```python
|
||||
class LLMGateway:
|
||||
def __init__(self, adapters: dict[str, LLMAdapter], runtime_config: GatewayRuntimeConfig):
|
||||
self.adapters = adapters
|
||||
self.config = runtime_config
|
||||
|
||||
async def chat(self, model_id, messages, **params) -> dict:
|
||||
# 1. 路由到 Adapter
|
||||
# 2. 主模型失败时 Fallback
|
||||
# 3. 流式中途失败不再 Fallback
|
||||
```
|
||||
|
||||
**Fallback 规则**:
|
||||
- 配置驱动,`runtime_config.ark_fallback_chain`
|
||||
- 流式中途失败 → 立即抛异常,不降级(避免内容混合)
|
||||
- 同步调用失败 → 按链降级,对用户透明
|
||||
|
||||
#### 3.2.2 Task Gateway
|
||||
|
||||
```python
|
||||
class TaskGateway:
|
||||
def __init__(self, adapters, storage, runtime_config):
|
||||
self.adapters = adapters
|
||||
self.storage = storage # Redis
|
||||
self.config = runtime_config
|
||||
self.circuit = CircuitBreaker()
|
||||
|
||||
async def submit(self, platform_id, task_type, payload, callback_url=None) -> str:
|
||||
# 1. 限流检查
|
||||
# 2. 熔断检查
|
||||
# 3. Adapter.submit() → 获取 platform_task_id
|
||||
# 4. 生成 internal_task_id (UUID)
|
||||
# 5. Redis 存储映射
|
||||
# 6. 返回 internal_task_id
|
||||
|
||||
async def query(self, internal_task_id) -> TaskStatus:
|
||||
# 1. 查 Redis 映射
|
||||
# 2. 非终态时穿透供应商查询(可选)
|
||||
# 3. 更新 Redis
|
||||
|
||||
async def handle_webhook(self, platform, headers, body, query):
|
||||
# 1. nonce 防重放检查
|
||||
# 2. Adapter.verify_signature()
|
||||
# 3. Adapter.parse_callback()
|
||||
# 4. 更新任务状态
|
||||
```
|
||||
|
||||
**内部 ID 隔离**:
|
||||
|
||||
```python
|
||||
# Redis 存储结构
|
||||
task:{internal_task_id} -> {
|
||||
"platform_id": "vidu",
|
||||
"platform_task_id": "vidu_abc123",
|
||||
"task_type": "lip_sync",
|
||||
"state": "processing",
|
||||
"submitted_at": "2026-05-02T12:00:00Z"
|
||||
}
|
||||
TTL: 3600 秒(1 小时)
|
||||
```
|
||||
|
||||
**轮询调度器**(火山字幕示例):
|
||||
|
||||
```python
|
||||
async def poll_until_complete(self, internal_task_id, max_wait=120):
|
||||
intervals = [0, 1, 2, 4, 8, 8, 10] # 非阻塞阶段
|
||||
|
||||
for interval in intervals:
|
||||
await asyncio.sleep(interval)
|
||||
status = await self.query(internal_task_id)
|
||||
if status.state == "completed":
|
||||
return status
|
||||
if status.state == "failed":
|
||||
raise TaskError(status.error_message)
|
||||
|
||||
# 切换 blocking 阶段
|
||||
while elapsed < max_wait:
|
||||
status = await self._query_with_blocking(internal_task_id)
|
||||
if status.state in ("completed", "failed"):
|
||||
return status
|
||||
|
||||
raise TaskError("任务超时")
|
||||
```
|
||||
|
||||
#### 3.2.3 Shared Infra
|
||||
|
||||
**Token Bucket**(内存级,`aiolimiter`):
|
||||
|
||||
```python
|
||||
vidu_limiter = AsyncLimiter(max_rate=20, time_period=1.0) # 20/s
|
||||
caption_limiter = AsyncLimiter(max_rate=2, time_period=1.0) # 2/s
|
||||
ark_limiter = AsyncLimiter(max_rate=50, time_period=1.0) # 50/s
|
||||
```
|
||||
|
||||
**CircuitBreaker**:
|
||||
|
||||
```python
|
||||
class CircuitBreaker:
|
||||
failure_threshold: int = 5 # 连续失败 5 次熔断
|
||||
recovery_timeout: float = 60.0 # 60 秒后探测恢复
|
||||
```
|
||||
|
||||
**Retry Policy**(`stamina`):
|
||||
|
||||
```python
|
||||
with stamina.retry_context(
|
||||
on=(httpx.NetworkError, httpx.TimeoutException),
|
||||
attempts=3,
|
||||
timeout=30.0,
|
||||
wait_initial=1.0,
|
||||
wait_max=10.0,
|
||||
):
|
||||
await adapter.submit(...)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 3.3 Application Service 层
|
||||
|
||||
**职责**:编排业务流程,不感知平台差异。
|
||||
|
||||
```python
|
||||
class ScriptService:
|
||||
async def generate_script(self, category, subcategory, duration):
|
||||
# 调用 LLM Gateway,不关心底层是火山方舟还是 OpenAI
|
||||
result = await llm_gateway.chat(
|
||||
model_id="doubao-seed-2-0-pro",
|
||||
messages=[...],
|
||||
temperature=0.7,
|
||||
)
|
||||
return self._parse_shots(result.data["content"])
|
||||
|
||||
class VideoService:
|
||||
async def submit_lip_sync(self, video_url, audio_url):
|
||||
# 调用 Task Gateway,不关心底层是 Vidu 还是 HeyGen
|
||||
task_id = await task_gateway.submit(
|
||||
platform_id="vidu",
|
||||
task_type="lip_sync",
|
||||
payload={"video_url": video_url, "audio_url": audio_url},
|
||||
callback_url=f"{settings.app_base_url}/webhooks/vidu",
|
||||
)
|
||||
return task_id
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 3.4 Router 层
|
||||
|
||||
**职责**:HTTP 语义转换,参数校验,统一返回格式。
|
||||
|
||||
**统一错误中间件**:
|
||||
|
||||
```python
|
||||
@app.exception_handler(PlatformError)
|
||||
async def platform_error_handler(request, exc: PlatformError):
|
||||
status = 502 if exc.retryable else 400
|
||||
return JSONResponse(
|
||||
status_code=status,
|
||||
content={
|
||||
"code": exc.status_code or 500,
|
||||
"message": str(exc),
|
||||
"data": None,
|
||||
"detail": {
|
||||
"platform": exc.platform,
|
||||
"retryable": exc.retryable,
|
||||
} if settings.DEBUG else None,
|
||||
},
|
||||
)
|
||||
```
|
||||
|
||||
**回调入口**:
|
||||
|
||||
```python
|
||||
@router.post("/webhooks/{platform}")
|
||||
async def universal_webhook(platform: str, request: Request):
|
||||
raw_headers = dict(request.headers)
|
||||
raw_body = await request.body()
|
||||
query_params = dict(request.query_params)
|
||||
|
||||
await task_gateway.handle_webhook(
|
||||
platform=platform,
|
||||
headers=raw_headers,
|
||||
body=raw_body,
|
||||
query=query_params,
|
||||
original_path=request.url.path,
|
||||
)
|
||||
return {"received": True}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 四、核心数据流
|
||||
|
||||
### 4.1 TTS 语音合成(同步调用)
|
||||
|
||||
```
|
||||
用户点击"生成配音"
|
||||
↓
|
||||
POST /voice/synthesize
|
||||
↓
|
||||
Router 校验参数
|
||||
↓
|
||||
ViduService.synthesize(text, voice_id...)
|
||||
↓
|
||||
LLM Gateway.call_sync(platform="vidu", method="tts", ...)
|
||||
↓
|
||||
Token Bucket 取令牌(rate=20/s)
|
||||
↓
|
||||
stamina 重试网络错误(最多3次)
|
||||
↓
|
||||
ViduAdapter.call(method="tts", ...)
|
||||
↓
|
||||
httpx.AsyncClient → Vidu API
|
||||
↓
|
||||
返回音频 URL
|
||||
```
|
||||
|
||||
**异常路径**:
|
||||
- 网络错误 → stamina 重试 → 3 次失败后抛 PlatformError(retryable=True) → 502
|
||||
- Vidu 返回 400 → PlatformError(retryable=False) → 400
|
||||
- Vidu 返回 500 → PlatformError(retryable=True) → 502
|
||||
|
||||
### 4.2 脚本生成 SSE(流式调用)
|
||||
|
||||
```
|
||||
用户点击"生成脚本"
|
||||
↓
|
||||
POST /script/generate/stream
|
||||
↓
|
||||
ScriptService.generate_script_stream(...)
|
||||
↓
|
||||
LLM Gateway.chat_stream(model_id="doubao-seed-2-0-pro", ...)
|
||||
↓
|
||||
VolcengineArkAdapter.chat_stream(...)
|
||||
↓
|
||||
SSE 流式输出
|
||||
```
|
||||
|
||||
**关键约束**:流式中途失败**不降级**。已输出内容保持不变,前端收到 error 事件后自行处理。
|
||||
|
||||
### 4.3 对口型任务提交(异步任务)
|
||||
|
||||
```
|
||||
用户点击"生成对口型"
|
||||
↓
|
||||
POST /vidu/lip-sync
|
||||
↓
|
||||
VideoService.submit_lip_sync(...)
|
||||
↓
|
||||
Task Gateway.submit(platform="vidu", task_type="lip_sync", ...)
|
||||
↓
|
||||
Token Bucket 取令牌(rate=5/s)
|
||||
↓
|
||||
CircuitBreaker 检查
|
||||
↓
|
||||
ViduAdapter.submit(method="lip_sync", ...)
|
||||
↓
|
||||
返回 platform_task_id
|
||||
↓
|
||||
生成 internal_task_id (UUID)
|
||||
↓
|
||||
Redis 存储映射
|
||||
↓
|
||||
返回 {task_id: internal_task_id}
|
||||
```
|
||||
|
||||
### 4.4 字幕生成轮询(异步任务 + 轮询)
|
||||
|
||||
```
|
||||
用户点击"生成字幕"
|
||||
↓
|
||||
POST /caption/generate
|
||||
↓
|
||||
CaptionService.generate_caption(...)
|
||||
↓
|
||||
Task Gateway.submit(platform="volcengine_caption", ...)
|
||||
↓
|
||||
返回 internal_task_id
|
||||
↓
|
||||
前端轮询 /tasks/{id}/status
|
||||
↓
|
||||
Gateway.query → Redis 命中 → 返回
|
||||
↓
|
||||
Redis 未命中或 state=processing → 穿透供应商查询 → 更新 Redis
|
||||
```
|
||||
|
||||
**轮询策略**(火山字幕):
|
||||
- 第 1 次:t=0s,blocking=0
|
||||
- 第 2 次:t=1s,blocking=0
|
||||
- 第 3 次:t=3s,blocking=0
|
||||
- 第 4 次起:t=7s, 12s, 17s...,blocking=1
|
||||
|
||||
### 4.5 回调处理(Vid 对口型完成)
|
||||
|
||||
```
|
||||
Vidu 服务器 POST /webhooks/vidu
|
||||
↓
|
||||
Router 提取 headers / body / query
|
||||
↓
|
||||
Task Gateway.handle_webhook(...)
|
||||
↓
|
||||
1. ViduAdapter.extract_nonce(headers) → nonce
|
||||
→ Redis 查 nonce 是否已用
|
||||
→ 已用 → 401
|
||||
2. ViduAdapter.verify_signature(headers, body, secret)
|
||||
→ 失败 → 401
|
||||
3. Redis 标记 nonce 已用(TTL 300s)
|
||||
4. ViduAdapter.parse_callback(body) → TaskStatus
|
||||
5. Redis 更新任务状态
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 五、并发控制
|
||||
|
||||
### 5.1 三层隔离模型
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────┐
|
||||
│ 第一层:任务层(Slot Scheduler) │
|
||||
│ 控制"同时有多少个异步任务在执行" │
|
||||
│ - 火山字幕:max 5 │
|
||||
│ - 对口型:按 Vidu 配额配置 │
|
||||
│ - 脚本生成:max 10 │
|
||||
├─────────────────────────────────────────┤
|
||||
│ 第二层:请求层(Gateway Token Bucket) │
|
||||
│ 控制"每秒向某平台发多少请求" │
|
||||
│ - Vidu TTS:20/s │
|
||||
│ - Vidu 对口型提交:5/s │
|
||||
│ - 火山方舟:50/s │
|
||||
│ - 火山字幕提交:2/s │
|
||||
├─────────────────────────────────────────┤
|
||||
│ 第三层:连接层(HTTP Client Pool) │
|
||||
│ 控制"同时保持多少条 TCP 连接" │
|
||||
│ - Vidu:max 20 │
|
||||
│ - 火山字幕:max 10 │
|
||||
│ - 火山方舟:SDK 内部管理 │
|
||||
└─────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### 5.2 流式连接单独计数
|
||||
|
||||
```python
|
||||
# LLM Gateway 内
|
||||
active_streams: dict[str, int] = {} # {platform: count}
|
||||
|
||||
# 流式上限
|
||||
MAX_STREAMS = {
|
||||
"volcengine_ark": 30,
|
||||
"openai": 30,
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 六、错误处理
|
||||
|
||||
### 6.1 异常类
|
||||
|
||||
```python
|
||||
class PlatformError(Exception):
|
||||
"""第三方平台调用失败"""
|
||||
def __init__(self, message, *, platform: str, retryable: bool = False, status_code: int | None = None):
|
||||
super().__init__(message)
|
||||
self.platform = platform
|
||||
self.retryable = retryable
|
||||
self.status_code = status_code
|
||||
|
||||
class TaskError(Exception):
|
||||
"""任务生命周期错误"""
|
||||
pass
|
||||
|
||||
class LLMError(Exception):
|
||||
"""LLM 调用失败(含 Fallback 耗尽)"""
|
||||
pass
|
||||
```
|
||||
|
||||
### 6.2 HTTP 状态码映射
|
||||
|
||||
| 场景 | PlatformError 属性 | HTTP 状态码 |
|
||||
|------|-------------------|------------|
|
||||
| 网络超时、DNS 失败、5xx | `retryable=True` | 502 Bad Gateway |
|
||||
| 供应商限流 429 | `retryable=True` | 429 Too Many Requests |
|
||||
| 认证失败 401/403 | `retryable=False` | 401 Unauthorized |
|
||||
| 参数错误 400 | `retryable=False` | 400 Bad Request |
|
||||
| 业务逻辑错误(state=failed) | `retryable=False` | 400 Bad Request |
|
||||
|
||||
### 6.3 全局响应格式
|
||||
|
||||
```json
|
||||
{
|
||||
"code": 0,
|
||||
"message": "成功",
|
||||
"data": {}
|
||||
}
|
||||
```
|
||||
|
||||
错误时:
|
||||
|
||||
```json
|
||||
{
|
||||
"code": 500,
|
||||
"message": "Vidu TTS 服务暂不可用",
|
||||
"data": null,
|
||||
"detail": {
|
||||
"platform": "vidu",
|
||||
"retryable": true
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 七、配置规范
|
||||
|
||||
### 7.1 嵌套配置模型
|
||||
|
||||
```python
|
||||
class ViduConfig(BaseModel):
|
||||
api_key: str = ""
|
||||
base_url: str = "https://api.vidu.com"
|
||||
max_connections: int = 20
|
||||
timeout: float = 30.0
|
||||
|
||||
class RuntimeConfig(BaseModel):
|
||||
"""运行时配置,支持热重载"""
|
||||
vidu_qps: float = 20.0
|
||||
vidu_burst: int = 30
|
||||
ark_fallback_chain: list[str] = ["doubao-seed-2-0-lite"]
|
||||
caption_poll_intervals: list[float] = [1.0, 1.0, 2.0, 2.0, 4.0, 4.0, 8.0, 8.0, 10.0]
|
||||
circuit_failure_threshold: int = 5
|
||||
circuit_recovery_timeout: float = 60.0
|
||||
|
||||
class Settings(BaseSettings):
|
||||
vidu: ViduConfig = Field(default_factory=ViduConfig)
|
||||
volcengine_ark: VolcengineArkConfig = Field(default_factory=VolcengineArkConfig)
|
||||
volcengine_caption: VolcengineCaptionConfig = Field(default_factory=VolcengineCaptionConfig)
|
||||
openai: OpenAIConfig = Field(default_factory=OpenAIConfig)
|
||||
runtime: RuntimeConfig = Field(default_factory=RuntimeConfig)
|
||||
|
||||
model_config = SettingsConfigDict(
|
||||
env_nested_delimiter="__",
|
||||
)
|
||||
```
|
||||
|
||||
### 7.2 `.env` 示例
|
||||
|
||||
```bash
|
||||
# === 启动配置(改后需重启)===
|
||||
VIDU__API_KEY=sk-xxx
|
||||
VIDU__BASE_URL=https://api.vidu.com
|
||||
VIDU__MAX_CONNECTIONS=20
|
||||
|
||||
VOLCENGINE_ARK__API_KEY=ak-xxx
|
||||
VOLCENGINE_CAPTION__APPID=app-xxx
|
||||
VOLCENGINE_CAPTION__TOKEN=tk-xxx
|
||||
|
||||
OPENAI__API_KEY=sk-xxx
|
||||
|
||||
# === 运行时配置(改后可热载)===
|
||||
RUNTIME__VIDU__QPS=20
|
||||
RUNTIME__VIDU__BURST=30
|
||||
RUNTIME__ARK__FALLBACK_CHAIN=doubao-seed-2-0-lite,doubao-lite-32k
|
||||
RUNTIME__CAPTION__POLL_INTERVALS=1,1,2,2,4,4,8,8,10
|
||||
RUNTIME__CIRCUIT__FAILURE_THRESHOLD=5
|
||||
```
|
||||
|
||||
### 7.3 热重载 API
|
||||
|
||||
```python
|
||||
@router.post("/admin/runtime-config")
|
||||
async def reload_runtime_config(updates: dict):
|
||||
gateway_registry.update_runtime_config(**updates)
|
||||
return {"updated": list(updates.keys())}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 八、日志与可观测性
|
||||
|
||||
### 8.1 结构化日志字段
|
||||
|
||||
```python
|
||||
{
|
||||
"event": "platform_call",
|
||||
"platform": "vidu",
|
||||
"method": "tts_sync",
|
||||
"task_type": "tts",
|
||||
"duration_ms": 1250,
|
||||
"success": true,
|
||||
"http_status": 200,
|
||||
"retry_count": 0,
|
||||
}
|
||||
```
|
||||
|
||||
### 8.2 脱敏规则
|
||||
|
||||
| 级别 | 字段 | 生产环境处理 |
|
||||
|------|------|------------|
|
||||
| P1 | `api_key`, `authorization`, `x-hmac-signature` | `[REDACTED]` |
|
||||
| P2 | `audio_url`, `video_url`, `text` | URL 去签名参数 / 文案截断前 30 字 |
|
||||
| P3 | `platform_task_id`, `internal_task_id` | 前缀保留 8 字符 |
|
||||
| P4 | `duration_ms`, `http_status`, `retry_count` | 完整保留 |
|
||||
|
||||
### 8.3 健康检查端点
|
||||
|
||||
```python
|
||||
@router.get("/system/platform-health")
|
||||
async def platform_health():
|
||||
results = {}
|
||||
for pid, adapter in registry.adapters.items():
|
||||
resp = await adapter.health()
|
||||
results[pid] = {
|
||||
"available": resp.success,
|
||||
"error": resp.error_message,
|
||||
}
|
||||
return results
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 九、迁移策略
|
||||
|
||||
### 9.1 迁移原则
|
||||
|
||||
- **新旧代码并行**:通过 flag 切换,可随时回滚
|
||||
- **逐个平台迁移**:Vidu → 火山字幕 → LLM
|
||||
- **前端无感知**:Router URL、请求体、响应体不变
|
||||
|
||||
### 9.2 Flag 切换机制
|
||||
|
||||
```python
|
||||
# Router 层
|
||||
USE_NEW_VIDU = settings.FEATURE_FLAGS.get("new_vidu_adapter", False)
|
||||
|
||||
@router.post("/voice/synthesize")
|
||||
async def synthesize(request: TTSSynthesizeRequest):
|
||||
if USE_NEW_VIDU:
|
||||
service = get_vidu_service_v2() # 新架构
|
||||
else:
|
||||
service = get_vidu_service() # 旧代码
|
||||
...
|
||||
```
|
||||
|
||||
### 9.3 迁移 Checklist
|
||||
|
||||
| 步骤 | 动作 | 验证 |
|
||||
|------|------|------|
|
||||
| 1 | 新建 `ViduAdapterV2`,实现 Protocol | 单元测试通过 |
|
||||
| 2 | 注册到 Gateway,flag 关闭 | 不影响线上 |
|
||||
| 3 | 测试环境开启 flag,全量回归 | 所有 Vidu 接口正常 |
|
||||
| 4 | 生产灰度 10% → 50% → 100% | 监控 error rate |
|
||||
| 5 | 旧代码保留 1 周后删除 | 无回滚需求 |
|
||||
|
||||
---
|
||||
|
||||
## 十、附录:最终决策清单
|
||||
|
||||
| # | 决策项 | 结论 |
|
||||
|---|--------|------|
|
||||
| 1 | 回调验签位置 | **C**:Router 提取纯数据 → Gateway 调度 → Adapter 验签 |
|
||||
| 2 | 任务结果保留 | 实时反映,Redis 映射 TTL = 1h |
|
||||
| 3 | 七牛云 | 不纳入新架构 |
|
||||
| 4 | SSE 断线 | 不支持续传 |
|
||||
| 5 | MockAdapter | 仅 `DEBUG=True` 时注册 |
|
||||
| 6 | 配置热重载 | **B**:限流参数 + Fallback 链可热载,Adapter 需重启 |
|
||||
| 7 | 日志脱敏 | 四级分级(P1/P2/P3/P4)+ 四档环境 + URL 智能剥离 |
|
||||
| 8 | 火山字幕轮询 | **B**:前 3 次非阻塞(0→1→3s)+ 后切换 `blocking=1` |
|
||||
| 9 | 迁移策略 | **C**:适配器层先行,flag 切换 |
|
||||
| 10 | API Key | 手动维护 |
|
||||
| 11 | 任务状态持久化 | **C**:Redis 开启 AOF 持久化 |
|
||||
@@ -0,0 +1,532 @@
|
||||
# 第三方平台接入架构标准化实施计划
|
||||
|
||||
> 版本:v1.0
|
||||
> 依据:third-party-integration-architecture.md(架构设计最终版)
|
||||
> 目标:统一异常体系、Adapter 契约、HTTP Client 生命周期、异步任务状态机、配置分层
|
||||
> 生效日期:2026-05-03
|
||||
|
||||
---
|
||||
|
||||
## 一、总则
|
||||
|
||||
### 1.1 实施原则
|
||||
|
||||
- **标准优先**:以行业主流做法为准,不因"改动小"而妥协
|
||||
- **文档先行**:所有变更必须在此文档中登记,实施完成后逐项核对
|
||||
- **标准优先**:存量代码直接按标准改造,不做兼容包装层
|
||||
- **渐进验证**:每阶段完成后运行测试,确认无回归再进入下一阶段
|
||||
|
||||
### 1.2 不适用本标准的例外
|
||||
|
||||
- 七牛云存储(纯上传下载,不纳入 Adapter 体系)
|
||||
|
||||
---
|
||||
|
||||
## 二、五条铁律规范(实施标准)
|
||||
|
||||
### 铁律 1:异常出口唯一
|
||||
|
||||
**规范内容**:所有 `app/services/`、`app/ai/` 下的代码,对外抛出的异常必须是 `PlatformError`。Router 只 `except PlatformError` 和 `AppException`(业务错误)。
|
||||
|
||||
**具体标准**:
|
||||
|
||||
```python
|
||||
# app/core/exceptions.py —— 唯一的第三方异常类
|
||||
class PlatformErrorType:
|
||||
RATE_LIMIT = "rate_limit"
|
||||
AUTH_FAILED = "auth_failed"
|
||||
TIMEOUT = "timeout"
|
||||
SERVER_ERROR = "server_error"
|
||||
BAD_REQUEST = "bad_request"
|
||||
QUOTA_EXHAUSTED = "quota_exhausted"
|
||||
NOT_FOUND = "not_found"
|
||||
UNKNOWN = "unknown"
|
||||
|
||||
class PlatformError(Exception):
|
||||
def __init__(
|
||||
self,
|
||||
message: str,
|
||||
*,
|
||||
platform: str,
|
||||
retryable: bool = False,
|
||||
error_type: str = PlatformErrorType.UNKNOWN,
|
||||
status_code: int | None = None,
|
||||
):
|
||||
super().__init__(message)
|
||||
self.platform = platform
|
||||
self.retryable = retryable
|
||||
self.error_type = error_type
|
||||
self.status_code = status_code
|
||||
```
|
||||
|
||||
**HTTP 状态码映射**(全局中间件):
|
||||
|
||||
| error_type | retryable | HTTP 状态码 |
|
||||
|-----------|-----------|------------|
|
||||
| rate_limit | True | 429 |
|
||||
| timeout | True | 504 |
|
||||
| server_error | True | 502 |
|
||||
| auth_failed | False | 401 |
|
||||
| bad_request | False | 400 |
|
||||
| quota_exhausted | False | 429(带 Retry-After) |
|
||||
| unknown | False | 400 |
|
||||
|
||||
**禁止事项**:
|
||||
- [ ] `app/services/` 和 `app/ai/` 中禁止 `raise HTTPException`
|
||||
- [ ] `app/services/` 和 `app/ai/` 中禁止裸 `raise Exception(...)`
|
||||
- [ ] 各 Router 中禁止 `except Exception: raise HTTPException(500, ...)` 处理第三方错误
|
||||
|
||||
---
|
||||
|
||||
### 铁律 2:Adapter 最小契约
|
||||
|
||||
**规范内容**:每个新平台必须实现 `PlatformAdapter`(`platform_id` + `health()` + `close()`)。按需实现 `SyncCapable`(同步调用)或 `TaskCapable`(异步任务)。
|
||||
|
||||
**Protocol 定义**:
|
||||
|
||||
```python
|
||||
# app/ai/adapters/base.py
|
||||
|
||||
@runtime_checkable
|
||||
class PlatformAdapter(Protocol):
|
||||
"""所有 Adapter 的准入门槛"""
|
||||
platform_id: str
|
||||
async def health(self) -> AdapterResponse: ...
|
||||
async def close(self) -> None: ...
|
||||
|
||||
@runtime_checkable
|
||||
class SyncCapable(Protocol):
|
||||
"""同步调用能力(TTS、Chat、图片生成等)"""
|
||||
async def call(self, method: str, payload: dict) -> AdapterResponse: ...
|
||||
|
||||
@runtime_checkable
|
||||
class TaskCapable(Protocol):
|
||||
"""异步任务能力(对口型、字幕、视频生成等)"""
|
||||
async def submit(self, task_type: str, payload: dict, callback_url: str | None) -> AdapterResponse: ...
|
||||
async def query(self, platform_job_id: str) -> TaskStatus: ...
|
||||
|
||||
@runtime_checkable
|
||||
class CallbackCapable(Protocol):
|
||||
"""回调验签能力(可选)"""
|
||||
async def verify_signature(self, headers: dict, body: bytes, secret: str) -> bool: ...
|
||||
async def parse_callback(self, body: bytes) -> TaskStatus: ...
|
||||
```
|
||||
|
||||
**统一返回值**:
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class AdapterResponse:
|
||||
success: bool
|
||||
data: dict | None = None
|
||||
error_code: str | None = None
|
||||
error_message: str | None = None
|
||||
retryable: bool = False
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class TaskStatus:
|
||||
state: str # "pending" | "processing" | "completed" | "failed"
|
||||
result: dict | None = None
|
||||
error_message: str | None = None
|
||||
```
|
||||
|
||||
**方法标识常量**:
|
||||
|
||||
```python
|
||||
# app/ai/adapters/constants.py
|
||||
class Method:
|
||||
TTS = "tts"
|
||||
CHAT = "chat"
|
||||
CHAT_STREAM = "chat_stream"
|
||||
IMAGE_GENERATE = "image_generate"
|
||||
LIP_SYNC = "lip_sync"
|
||||
CAPTION = "caption"
|
||||
AUTO_ALIGN = "auto_align"
|
||||
```
|
||||
|
||||
**各平台对号入座**:
|
||||
|
||||
| 平台 | 必须实现 | 当前状态 |
|
||||
|-----|---------|---------|
|
||||
| 火山方舟 | `PlatformAdapter + SyncCapable` | ❌ 缺失,需新建 `VolcengineArkAdapter` |
|
||||
| Vidu | `PlatformAdapter + SyncCapable + TaskCapable + CallbackCapable` | ❌ 缺失,需新建 `ViduAdapter` |
|
||||
| 火山字幕 | `PlatformAdapter + TaskCapable` | ❌ 缺失,需新建 `VolcengineCaptionAdapter` |
|
||||
|
||||
**禁止事项**:
|
||||
- [ ] 新增平台不实现 `PlatformAdapter` 直接接入
|
||||
- [ ] Adapter 内部抛出的异常不是 `PlatformError`
|
||||
- [ ] `call()` 方法返回裸 `dict` 而不是 `AdapterResponse`
|
||||
|
||||
---
|
||||
|
||||
### 铁律 3:HTTP Client 统一关闭
|
||||
|
||||
**规范内容**:所有对外 HTTP 连接(`httpx.AsyncClient` 或 SDK 内部)必须在 `lifespan` 中创建和销毁。禁止在方法内临时创建 `httpx.AsyncClient()`。
|
||||
|
||||
**具体标准**:
|
||||
|
||||
```python
|
||||
# lifespan 中的标准写法
|
||||
@asynccontextmanager
|
||||
async def lifespan(app: FastAPI):
|
||||
# 各平台独立 Client(故障隔离)
|
||||
app.state.http_clients = {
|
||||
"vidu": httpx.AsyncClient(timeout=30, limits=httpx.Limits(max_connections=20)),
|
||||
"volcengine_caption": httpx.AsyncClient(timeout=60, limits=httpx.Limits(max_connections=10)),
|
||||
"default": httpx.AsyncClient(timeout=30, limits=httpx.Limits(max_connections=50)),
|
||||
}
|
||||
|
||||
# SDK 客户端
|
||||
app.state.ark_client = AsyncArk(api_key=..., timeout=1800)
|
||||
|
||||
yield
|
||||
|
||||
# 统一关闭
|
||||
for client in app.state.http_clients.values():
|
||||
await client.aclose()
|
||||
|
||||
if hasattr(app.state, "ark_client") and not app.state.ark_client.is_closed():
|
||||
await app.state.ark_client.close()
|
||||
```
|
||||
|
||||
**迁移清单**:
|
||||
|
||||
| 文件 | 当前 Client | 整改方式 |
|
||||
|-----|------------|---------|
|
||||
| `app/ai/providers/vidu_provider.py` | `aiohttp.ClientSession` | 迁移为 `httpx.AsyncClient`,由 lifespan 注入 |
|
||||
| `app/services/volcengine_caption_service.py` | `httpx.AsyncClient`(懒加载,永不关闭) | 改为 lifespan 注入,删除 `_get_client()` 懒加载 |
|
||||
| `app/api/v1/voice.py` | 临时 `httpx.AsyncClient()` | 改为 `app.state.http_clients["default"]` |
|
||||
|
||||
**禁止事项**:
|
||||
- [ ] 禁止 import `aiohttp`(项目统一使用 `httpx`)
|
||||
- [ ] 禁止在方法/路由内 `httpx.AsyncClient()` 临时创建(下载大文件例外,需注释说明)
|
||||
- [ ] 禁止 Provider `__init__` 中创建 Client 而不在 lifespan 中关闭
|
||||
|
||||
---
|
||||
|
||||
### 铁律 4:异步任务状态唯一
|
||||
|
||||
**规范内容**:所有"提交后等待"的任务,状态必须写入统一的状态机。第三方回调走统一入口 `/webhooks/{platform}`。
|
||||
|
||||
> **注意**:本铁律涉及数据库设计,当前文档中 SQLAlchemy `Job` model 相关设计已挂起,待数据库方案确定后补充。本章只规定接口和状态机标准。
|
||||
|
||||
**统一状态枚举**:
|
||||
|
||||
```python
|
||||
class JobStatus(str, Enum):
|
||||
PENDING = "pending" # 已提交,等待调度
|
||||
QUEUED = "queued" # 已进入队列,等待槽位
|
||||
RUNNING = "running" # 正在执行
|
||||
SUCCEEDED = "succeeded" # 成功完成
|
||||
FAILED = "failed" # 失败
|
||||
CANCELLED = "cancelled" # 用户取消或超时取消
|
||||
```
|
||||
|
||||
**统一任务 API**:
|
||||
|
||||
```python
|
||||
# 提交任务
|
||||
POST /jobs
|
||||
Request: {platform: str, task_type: str, payload: dict, idempotency_key: str | None}
|
||||
Response: {job_id: UUID, status: "pending"}
|
||||
|
||||
# 查询任务
|
||||
GET /jobs/{job_id}
|
||||
Response: {job_id, status, progress, message, result, error, created_at, updated_at}
|
||||
|
||||
# 统一回调入口
|
||||
POST /webhooks/{platform}
|
||||
```
|
||||
|
||||
**第三方状态映射**(Adapter 层负责):
|
||||
|
||||
| 第三方状态 | 内部状态 |
|
||||
|-----------|---------|
|
||||
| Vidu: `pending` / `processing` | `RUNNING` |
|
||||
| Vidu: `success` | `SUCCEEDED` |
|
||||
| Vidu: `failed` | `FAILED` |
|
||||
| 火山字幕: `code=2000` | `RUNNING` |
|
||||
| 火山字幕: `code=0` | `SUCCEEDED` |
|
||||
| 火山字幕: `code=1001/1002/1012` | `FAILED`(不可重试) |
|
||||
| 火山字幕: `code=1003`(超频) | `FAILED`(可重试) |
|
||||
|
||||
**禁止事项**:
|
||||
- [ ] 禁止 Router/Service 私设 Redis key(如 `vidu:lipsync:xxx`)
|
||||
- [ ] 禁止在 Router 中直接处理回调验签(必须由 Adapter 处理)
|
||||
- [ ] 禁止各平台使用自己的状态字符串返回给前端
|
||||
|
||||
---
|
||||
|
||||
### 铁律 5:配置与密钥分离
|
||||
|
||||
**规范内容**:非敏感配置走 `config/platform-config.yaml`,支持热重载。密钥走 `.env`,修改需重启。
|
||||
|
||||
**文件结构**:
|
||||
|
||||
```yaml
|
||||
# config/platform-config.yaml
|
||||
platforms:
|
||||
<platform_id>:
|
||||
provider: <provider_type>
|
||||
base_url: <url>
|
||||
models: # 原 ai_models.yaml 内容合并至此
|
||||
- id: <model_alias>
|
||||
model_name: <实际模型ID>
|
||||
capabilities: [<capability>]
|
||||
default_params: <dict>
|
||||
rate_limit:
|
||||
qps: <float>
|
||||
burst: <int>
|
||||
methods:
|
||||
<method>:
|
||||
timeout: <int>
|
||||
max_connections: <int>
|
||||
rate_limit:
|
||||
qps: <float>
|
||||
burst: <int>
|
||||
|
||||
runtime:
|
||||
fallback_chains:
|
||||
<capability>:
|
||||
- <model_alias_1>
|
||||
- <model_alias_2>
|
||||
task_timeouts:
|
||||
<task_type>: <seconds>
|
||||
task_ttl:
|
||||
<task_type>: <seconds>
|
||||
```
|
||||
|
||||
**热重载实现**:
|
||||
|
||||
```python
|
||||
class RuntimeConfig:
|
||||
"""运行时配置,轮询检查 mtime(10秒间隔)+ Admin API 手动触发"""
|
||||
|
||||
async def get(self, key: str, default=None):
|
||||
await self._reload_if_changed()
|
||||
return self._config.get(key, default)
|
||||
|
||||
async def force_reload(self) -> bool:
|
||||
"""Admin API 调用"""
|
||||
...
|
||||
```
|
||||
|
||||
**Admin API**:
|
||||
|
||||
```python
|
||||
@router.post("/admin/runtime-config/reload")
|
||||
async def reload_runtime_config():
|
||||
success = await runtime_config.force_reload()
|
||||
return {"reloaded": success, "version": runtime_config.version}
|
||||
|
||||
@router.get("/admin/runtime-config")
|
||||
async def get_runtime_config():
|
||||
return runtime_config.get_raw()
|
||||
```
|
||||
|
||||
**迁移清单**:
|
||||
|
||||
| `.env` 中的配置项 | 迁移目标 | 状态 |
|
||||
|------------------|---------|------|
|
||||
| `VIDU_BASE_URL` | `platforms.vidu.base_url` | 待迁移 |
|
||||
| `VOLCENGINE_BASE_URL` | `platforms.volcengine_ark.base_url` | 待迁移 |
|
||||
| `VOLC_SUBTITLE_MAX_CONCURRENT` | `platforms.volcengine_caption.methods.caption.max_connections` | 待迁移 |
|
||||
| `VOLC_SUBTITLE_TIMEOUT` | `runtime.task_timeouts.caption` | 待迁移 |
|
||||
|
||||
**禁止事项**:
|
||||
- [ ] 禁止在 `.env` 中存放非敏感配置(URL、超时、限流)
|
||||
- [ ] 禁止代码中硬编码配置(如 `timeout=30`、`max_rate=20`)
|
||||
- [ ] 禁止 `Settings` 类超过 150 行(逐步瘦身)
|
||||
|
||||
---
|
||||
|
||||
## 三、分阶段实施计划
|
||||
|
||||
### Phase 0:准备(0.5 天)
|
||||
|
||||
| # | 任务 | 输出文件 | 检查方式 |
|
||||
|---|------|---------|---------|
|
||||
| 0.1 | 新建 `app/ai/adapters/` 目录结构 | `app/ai/adapters/__init__.py` | 目录存在 |
|
||||
| 0.2 | 新建 `app/platform_gateway.py` 骨架 | `app/platform_gateway.py` | 文件存在,类定义完整 |
|
||||
| 0.3 | 安装/确认 `importlinter` 可用 | `pyproject.toml` 依赖 | `pip show importlinter` |
|
||||
| 0.4 | 备份现有 `exceptions.py` | git stash / 分支 | 可回滚 |
|
||||
|
||||
### Phase 1:异常体系(0.5 天)
|
||||
|
||||
| # | 任务 | 输出文件 | 检查方式 |
|
||||
|---|------|---------|---------|
|
||||
| 1.1 | 重构 `PlatformError` + `PlatformErrorType` | `app/core/exceptions.py` | 类型定义完整,含所有字段 |
|
||||
| 1.2 | 保留 `AppException` 体系(业务错误) | `app/core/exceptions.py` | 原有类不删除 |
|
||||
| 1.3 | `main.py` 注册 `PlatformError` 全局中间件 | `app/main.py` | 启动无报错,异常测试返回正确 HTTP 码 |
|
||||
| 1.4 | `VolcengineArkAdapter._wrap_error()` 实现异常映射 | `app/ai/adapters/volcengine_ark.py` | 单元测试覆盖 |
|
||||
| 1.5 | `ViduAdapter._wrap_error()` 实现异常映射 | `app/ai/adapters/vidu.py` | 单元测试覆盖 |
|
||||
| 1.6 | `make lint-semantic` 增加异常规则 | `Makefile` | 提交时自动检查 |
|
||||
|
||||
**验收标准**:
|
||||
- [ ] 任意第三方调用失败,Router 返回的 JSON 中 `detail.retryable` 正确
|
||||
- [ ] 网络超时返回 504,限流返回 429,认证失败返回 401
|
||||
- [ ] 业务错误(如参数校验失败)仍走 `AppException` → 400/422
|
||||
|
||||
### Phase 2:Adapter Protocol + 配置合并(1 天)
|
||||
|
||||
| # | 任务 | 输出文件 | 检查方式 |
|
||||
|---|------|---------|---------|
|
||||
| 2.1 | `PlatformAdapter` / `SyncCapable` / `TaskCapable` / `CallbackCapable` Protocol | `app/ai/adapters/base.py` | `isinstance` 校验通过 |
|
||||
| 2.2 | `AdapterResponse` / `TaskStatus` dataclass | `app/ai/adapters/base.py` | frozen=True,字段完整 |
|
||||
| 2.3 | `Method` 常量定义 | `app/ai/adapters/constants.py` | 覆盖所有现有方法 |
|
||||
| 2.4 | 合并 `ai_models.yaml` → `platform-config.yaml` | `config/platform-config.yaml` | 原有模型列表完整迁移 |
|
||||
| 2.5 | `RuntimeConfig` 热重载实现 | `app/core/runtime_config.py` | mtime 轮询 + force_reload 均工作 |
|
||||
| 2.6 | Admin API `/admin/runtime-config/*` | `app/api/v1/system.py` | GET/POST 返回正确 |
|
||||
| 2.7 | `Settings` 类清理非敏感配置 | `app/config.py` | 只保留密钥,行数 < 150 |
|
||||
|
||||
**验收标准**:
|
||||
- [ ] 新增一个 MockAdapter 实现 Protocol,IDE 自动提示缺失方法
|
||||
- [ ] 修改 `runtime.yaml` 中的 qps,10 秒内新请求生效
|
||||
- [ ] Admin API 手动触发 reload,返回最新配置
|
||||
|
||||
### Phase 3:HTTP Client 统一(1 天)
|
||||
|
||||
| # | 任务 | 输出文件 | 检查方式 |
|
||||
|---|------|---------|---------|
|
||||
| 3.1 | `ViduProvider` 从 `aiohttp` 迁移到 `httpx` | `app/ai/providers/vidu_provider.py` | 功能测试通过 |
|
||||
| 3.2 | `VolcengineCaptionService` 删除懒加载,改为注入 Client | `app/services/volcengine_caption_service.py` | 功能测试通过 |
|
||||
| 3.3 | `voice.py` 中临时 `httpx.AsyncClient()` 改为共享 Client | `app/api/v1/voice.py` | 代码审查 |
|
||||
| 3.4 | lifespan 统一管理所有 Client 生命周期 | `app/main.py` | 启动/关闭无泄漏日志 |
|
||||
| 3.5 | `ViduAdapter.close()` / `VolcengineCaptionAdapter.close()` 实现 | 对应 Adapter 文件 | lifespan shutdown 时调用 |
|
||||
| 3.6 | `make lint` 增加 `aiohttp` import 禁止规则 | `pyproject.toml` 或 pre-commit | import aiohttp 报 error |
|
||||
|
||||
**验收标准**:
|
||||
- [ ] `pip list | grep aiohttp` 无输出(或确认仅作为间接依赖)
|
||||
- [ ] `python -m app.main` 启动后,关闭时无 `unclosed client session` 警告
|
||||
- [ ] 所有 `AsyncClient` 创建都在 lifespan 中
|
||||
|
||||
### Phase 4:Gateway 骨架 + Adapter 包装层(1 天)
|
||||
|
||||
| # | 任务 | 输出文件 | 检查方式 |
|
||||
|---|------|---------|---------|
|
||||
| 4.1 | `PlatformGateway` 骨架(`call_sync` / `submit_task` / `query_task` / `handle_webhook`) | `app/platform_gateway.py` | 类方法签名完整 |
|
||||
| 4.2 | `VolcengineArkAdapter` 改造现有 Provider 实现 Protocol | `app/ai/adapters/volcengine_ark.py` | 单元测试通过 |
|
||||
| 4.3 | `ViduAdapter` 改造现有 Provider 实现 Protocol | `app/ai/adapters/vidu.py` | 单元测试通过 |
|
||||
| 4.4 | `VolcengineCaptionAdapter` 改造现有 Service 实现 Protocol | `app/ai/adapters/volcengine_caption.py` | 单元测试通过 |
|
||||
| 4.5 | `LLMGateway` 实现(模型选择、Fallback、流式路由) | `app/ai/gateways/llm_gateway.py` | 脚本生成功能测试通过 |
|
||||
| 4.6 | lifespan 中初始化所有 Adapter 并注册到 Gateway | `app/main.py` | 启动日志显示各平台初始化成功 |
|
||||
|
||||
**验收标准**:
|
||||
- [ ] 新增一个 `MockAdapter` 实现 Protocol,5 分钟内完成注册并可用
|
||||
- [ ] `LLMGateway.chat()` 主模型失败时自动 Fallback 到备用模型
|
||||
- [ ] 健康检查 `/system/platform-health` 返回所有平台状态
|
||||
|
||||
### Phase 5:异步任务统一(2 天,数据库方案确定后实施)
|
||||
|
||||
| # | 任务 | 输出文件 | 检查方式 |
|
||||
|---|------|---------|---------|
|
||||
| 5.1 | SQLAlchemy `Job` model(独立设计) | `app/models/job.py` | Alembic 迁移成功 |
|
||||
| 5.2 | Pydantic `JobResponse` Schema | `app/schemas/job.py` | 覆盖所有字段 |
|
||||
| 5.3 | `JobRegistry` 改为先写数据库、再写 Redis | `app/scheduler/registry.py` | 数据库有数据 |
|
||||
| 5.4 | `JobStatus` 扩展为 6 种状态 | `app/schemas/enums.py` | 覆盖所有场景 |
|
||||
| 5.5 | `ViduHandler` 接入 Async Engine | `app/scheduler/handlers/vidu_handler.py` | 对口型任务走 Engine |
|
||||
| 5.6 | `SubtitleHandler` 改为通过 Gateway 调用 | `app/scheduler/handlers/subtitle_handler.py` | 字幕任务走 Gateway |
|
||||
| 5.7 | 统一回调入口 `/webhooks/{platform}` | `app/api/v1/webhooks.py` | Vidu 回调正常 |
|
||||
| 5.8 | 删除 Router 中私设 Redis key 的代码 | `app/api/v1/vidu.py` | 无 `vidu:lipsync:` 字样 |
|
||||
| 5.9 | 统一任务 API `/jobs/{job_id}` | `app/api/v1/jobs.py` | GET 返回标准格式 |
|
||||
| 5.10 | 脚本生成从 SSE 改为异步任务 | `app/api/v1/script.py` / `app/services/script_service.py` | POST /jobs 提交,轮询 /jobs/{id} |
|
||||
| 5.11 | 删除 `/script/generate/stream` SSE 端点 | `app/api/v1/script.py` | 端点不存在 |
|
||||
|
||||
**验收标准**:
|
||||
- [ ] Vidu 对口型任务提交后,Redis 中只有 `job:{uuid}` 格式的 key
|
||||
- [ ] 应用重启后,从数据库恢复 running 任务继续执行
|
||||
- [ ] 前端轮询 `/jobs/{id}` 获取所有异步任务状态
|
||||
|
||||
### Phase 6:清理与验证(0.5 天)
|
||||
|
||||
| # | 任务 | 输出文件 | 检查方式 |
|
||||
|---|------|---------|---------|
|
||||
| 6.1 | `importlinter` 配置(禁止 Router 直接 import Provider) | `.importlinter` | CI 中运行通过 |
|
||||
| 6.2 | 删除废弃的 `ai_models.yaml`(确认合并完成后) | — | 文件不存在 |
|
||||
| 6.3 | 删除 `ViduService` / `VolcengineCaptionService` 中的重复异常处理 | 对应文件 | 代码审查 |
|
||||
| 6.4 | 全量回归测试(所有现有 API 调用一遍) | — | 测试脚本通过 |
|
||||
| 6.5 | 更新本文档,标记各阶段完成状态 | 本文档 | 所有 checkbox 打勾 |
|
||||
|
||||
---
|
||||
|
||||
## 四、检查清单汇总
|
||||
|
||||
### 4.1 新增文件清单
|
||||
|
||||
| 文件路径 | 说明 | 所属阶段 |
|
||||
|---------|------|---------|
|
||||
| `app/ai/adapters/__init__.py` | Adapter 包 | Phase 0 |
|
||||
| `app/ai/adapters/base.py` | Protocol + dataclass | Phase 2 |
|
||||
| `app/ai/adapters/constants.py` | Method 常量 | Phase 2 |
|
||||
| `app/ai/adapters/volcengine_ark.py` | 火山方舟 Adapter | Phase 4 |
|
||||
| `app/ai/adapters/vidu.py` | Vidu Adapter | Phase 4 |
|
||||
| `app/ai/adapters/volcengine_caption.py` | 火山字幕 Adapter | Phase 4 |
|
||||
| `app/ai/gateways/llm_gateway.py` | LLM 网关 | Phase 4 |
|
||||
| `app/platform_gateway.py` | 统一平台网关 | Phase 0/4 |
|
||||
| `app/core/runtime_config.py` | 运行时配置 + 热重载 | Phase 2 |
|
||||
| `config/platform-config.yaml` | 合并后的平台配置 | Phase 2 |
|
||||
| `app/models/job.py` | 异步任务数据库模型 | Phase 5 |
|
||||
| `app/api/v1/jobs.py` | 统一任务 API | Phase 5 |
|
||||
| `app/api/v1/webhooks.py` | 统一回调入口 | Phase 5 |
|
||||
| `app/scheduler/handlers/vidu_handler.py` | Vidu 任务处理器 | Phase 5 |
|
||||
| `.importlinter` | 架构约束配置 | Phase 6 |
|
||||
|
||||
### 4.2 修改文件清单
|
||||
|
||||
| 文件路径 | 修改内容 | 所属阶段 |
|
||||
|---------|---------|---------|
|
||||
| `app/core/exceptions.py` | 新增 `PlatformError` / `PlatformErrorType` | Phase 1 |
|
||||
| `app/main.py` | 注册异常中间件、lifespan Client 管理 | Phase 1/3/4 |
|
||||
| `app/config.py` | 清理非敏感配置,只保留密钥 | Phase 2 |
|
||||
| `app/ai/providers/vidu_provider.py` | aiohttp → httpx | Phase 3 |
|
||||
| `app/services/volcengine_caption_service.py` | 删除懒加载,改为注入 Client | Phase 3 |
|
||||
| `app/api/v1/voice.py` | 临时 Client → 共享 Client | Phase 3 |
|
||||
| `app/api/v1/script.py` | SSE → 异步任务 + 删除 stream 端点 | Phase 5 |
|
||||
| `app/services/script_service.py` | 删除 generate_script_stream | Phase 5 |
|
||||
| `app/api/v1/system.py` | 新增 Admin API | Phase 2 |
|
||||
| `app/scheduler/registry.py` | 先写数据库再写 Redis | Phase 5 |
|
||||
| `app/scheduler/handlers/subtitle_handler.py` | 通过 Gateway 调用 | Phase 5 |
|
||||
| `app/api/v1/vidu.py` | 删除私设 Redis key | Phase 5 |
|
||||
| `app/schemas/enums.py` | 扩展 `JobStatus` | Phase 5 |
|
||||
| `Makefile` / `pyproject.toml` | lint 规则 | Phase 1/3/6 |
|
||||
|
||||
### 4.3 废弃文件清单
|
||||
|
||||
| 文件路径 | 废弃原因 | 处理时间 |
|
||||
|---------|---------|---------|
|
||||
| `config/ai_models.yaml` | 合并到 `platform-config.yaml` | Phase 6 |
|
||||
|
||||
---
|
||||
|
||||
## 五、风险项与应对
|
||||
|
||||
| 风险 | 影响 | 概率 | 应对 |
|
||||
|-----|------|------|------|
|
||||
| `aiohttp` 迁移到 `httpx` 导致 Vidu 某些边缘场景行为不一致 | 功能回归 | 中 | 迁移后全量测试 Vidu TTS/对口型/克隆 |
|
||||
| `PlatformError` 未覆盖所有异常路径,仍有裸 Exception 漏出 | 前端收到 500 无法处理 | 低 | `make lint-semantic` 强制检查 + Code Review |
|
||||
| 配置热重载导致运行时行为突变 | 线上限流突然变更 | 低 | Admin API 加操作日志,变更前确认 |
|
||||
| Phase 5 数据库改造影响现有 Async Engine | 字幕/脚本任务异常 | 中 | 数据库方案评审后再实施,分步迁移 |
|
||||
| 前端轮询改造工作量超预期 | 延期 | 中 | 提前与前端同步接口变更,预留 2 天 |
|
||||
|
||||
---
|
||||
|
||||
## 六、验收标准(最终 Checklist)
|
||||
|
||||
实施全部完成后,按以下清单逐项核对:
|
||||
|
||||
- [ ] `PlatformError` 是 `app/services/` 和 `app/ai/` 中唯一的第三方异常类型
|
||||
- [ ] Router 中不存在 `except Exception: raise HTTPException(500)` 处理第三方错误
|
||||
- [ ] 新增 MockAdapter 实现 Protocol,30 分钟内完成注册并可用
|
||||
- [ ] `aiohttp` 不在项目直接依赖中(`pip show aiohttp` 不显示或仅为间接依赖)
|
||||
- [ ] 所有 `AsyncClient` 在 lifespan 中创建和销毁
|
||||
- [ ] 关闭应用时无 `unclosed client session` 警告
|
||||
- [ ] `config/platform-config.yaml` 存在且包含所有平台配置
|
||||
- [ ] 修改 `platform-config.yaml` 中的限流参数,10 秒内新请求生效
|
||||
- [ ] Admin API `/admin/runtime-config/reload` 手动触发重载成功
|
||||
- [ ] 健康检查 `/system/platform-health` 返回所有平台状态
|
||||
- [ ] `importlinter` CI 检查通过(Router 不直接 import Provider)
|
||||
- [ ] 全量 API 回归测试通过
|
||||
|
||||
---
|
||||
|
||||
> 本文档为实施的唯一依据。任何偏离文档的变更必须在此文档中登记并说明理由。
|
||||
Reference in New Issue
Block a user