refactor: third-party platform architecture overhaul (Adapter Protocol + Gateway)

Phase 1: Unified exception hierarchy
- Add the standard PlatformError / PlatformErrorType definitions
- Convert all exceptions raised by Providers to PlatformError
- Register a global PlatformError exception handler

Phase 2: Adapter Protocol
- Add app/ai/adapters/base.py (PlatformAdapter + SyncCapable + TaskCapable + CallbackCapable)
- Add app/ai/adapters/constants.py (Method constants)
- Add PlatformConfigLoader (config/platform-config.yaml)

Phase 3: Unified HTTP client
- Migrate ViduProvider from aiohttp to httpx (client injected)
- Change VolcengineCaptionService to take an injected http_client
- Manage creation and shutdown of all clients in lifespan

Phase 4: Gateway skeleton + Adapter implementations
- Add ViduAdapter / VolcengineArkAdapter / VolcengineCaptionAdapter
- Add PlatformGateway (call_sync / submit_task / query_task / handle_webhook)
- Add LLMGateway (with a fallback chain)
- Register all Adapters and Gateways in lifespan

Phase 6: Cleanup and verification
- Remove VIDU_BASE_URL / VOLCENGINE_BASE_URL from Settings
- Providers now read base_url from PlatformConfigLoader
- Remove the volcengine_caption_service global singleton
- Change the config_loader default path to platform-config.yaml
- Inject the shared HTTP client into the Scheduler
- The vidu.py callback route uses the Adapter for signature verification and parsing
- ruff passes on the whole codebase; application startup test passes
2026-05-04 16:07:16 +08:00
parent 0c921aca11
commit e58159fc42
34 changed files with 3688 additions and 1030 deletions
@@ -0,0 +1,704 @@
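+# Third-Party Platform Integration Architecture Design (Final)
+
+> Version: v1.0 Final
+> Scope: all third-party service integration layers under `python-api/`
+> Effective date: 2026-05-02
+
+---
+
+## 1. Design Goals
+
+| Goal | Acceptance criterion |
+|------|---------|
+| Onboarding a new platform takes < 30 minutes | An Adapter template is provided; copy it and fill in 4 methods |
+| Third-party failures do not drag users down | On a single-point failure the user gets an explicit error within 100 ms instead of waiting out a 30-second timeout |
+| No conflicts under concurrent use | 5 users generating TTS/scripts at the same time do not trigger third-party 429 rate limiting |
+| Task state is traceable | After closing and reopening the app, the user can resume in-progress lip-sync/caption tasks |
+| Future vendor swaps are transparent | When the TTS vendor changes, the frontend API, storage layer, and user history are all unaffected |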
+
+---
+
+## 2. Overall Architecture
+
+```
+┌──────────────────────────────────────────────┐
+│  Router（FastAPI）                            │
+│  - 校验输入、序列化输出                         │
+│  - 统一错误中间件                              │
+│  - 不处理重试、不限流、不直接调第三方            │
+│  - 回调入口：/webhooks/{platform}              │
+├──────────────────────────────────────────────┤
+│  Application Service                          │
+│  - ScriptService：编排脚本→TTS→对口型          │
+│  - VideoService：编排字幕→合成                 │
+│  - 只操作领域对象，不感知平台差异                │
+├──────────────────────────────────────────────┤
+│  Gateway Layer                                │
+│  ┌─────────────────┐  ┌──────────────────┐   │
+│  │  LLM Gateway    │  │  Task Gateway    │   │
+│  │  - 模型路由      │  │  - 任务状态机     │   │
+│  │  - Fallback     │  │  - 轮询调度       │   │
+│  │  - 流式代理      │  │  - 回调处理       │   │
+│  └────────┬────────┘  └────────┬─────────┘   │
+│           │                    │              │
+│           └────────────────────┘              │
+│                    │                          │
+│           ┌────────▼────────┐                 │
+│           │  Shared Infra   │                 │
+│           │  - Token Bucket │                 │
+│           │  - CircuitBreaker│                │
+│           │  - stamina retry │                │
+│           │  - Structured Log│               │
+│           └────────┬────────┘                 │
+├────────────────────┼─────────────────────────┤
+│  Adapter Layer     │                         │
+│  - VolcengineArkAdapter                      │
+│  - OpenAIAdapter                              │
+│  - ViduAdapter                                │
+│  - VolcengineCaptionAdapter                   │
+│  - MockAdapter                                │
+│  每个 Adapter：Protocol 约定，无状态，可替换   │
+├──────────────────────────────────────────────┤
+│  Transport Layer                              │
+│  - httpx.AsyncClient（所有 Raw HTTP）         │
+│  - 官方 SDK（仅 LLM 层：AsyncArk/AsyncOpenAI）│
+│  - lifespan 显式创建、显式关闭                │
+└────────────────────────────────────────────────┘
+```
+
+---
+
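+## 3. Layered Design
+
+### 3.1 Adapter Layer
+
+**Responsibilities**: pure translation. It converts internal standard requests to and from vendor-specific requests, and internal standard responses to and from vendor-specific responses.
+
+**Non-responsibilities**: retries, rate limiting, business logic, state management.
+
+**Protocol contract**: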
+
+```python
+class LLMAdapter(Protocol):
+    platform_id: str
+    async def chat(self, messages, model, **params) -> AdapterResponse: ...
+    async def chat_stream(self, messages, model, **params): ...  # AsyncIterator
+    async def health(self) -> AdapterResponse: ...
+    async def close(self) -> None: ...
+
+class TaskAdapter(Protocol):
+    platform_id: str
+    async def submit(self, task_type, payload, callback_url) -> AdapterResponse: ...
+    async def query(self, platform_task_id) -> TaskStatus: ...
+    async def parse_callback(self, body) -> TaskStatus: ...
+    async def verify_signature(self, headers, body, secret) -> bool: ...
+    async def extract_nonce(self, headers) -> str | None: ...
+    async def health(self) -> AdapterResponse: ...
+    async def close(self) -> None: ...
+```
+
+**AdapterResponse 标准格式**：
+
+```python
+@dataclass(frozen=True)
+class AdapterResponse:
+    success: bool
+    data: dict | None = None
+    error_code: str | None = None
+    error_message: str | None = None
+    retryable: bool = False  # Gateway 据此决定是否重试
+```
+
+**TaskStatus 标准格式**：
+
+```python
+@dataclass(frozen=True)
+class TaskStatus:
+    task_id: str           # 供应商 task_id
+    state: str             # "pending" | "processing" | "completed" | "failed"
+    result: dict | None = None
+    error_message: str | None = None
+```
+
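+**Unified clients**:
+- All raw HTTP goes through `httpx.AsyncClient`.
+- The official LLM SDKs (AsyncArk, AsyncOpenAI) are kept, but are explicitly `close()`d during lifespan shutdown.
+- Each Adapter has its own client and its own connection pool, so they do not interfere with each other.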
+
+---
+
+### 3.2 Gateway Layer
+
+#### 3.2.1 LLM Gateway
+
+```python
+class LLMGateway:
+    def __init__(self, adapters: dict[str, LLMAdapter], runtime_config: GatewayRuntimeConfig):
+        self.adapters = adapters
+        self.config = runtime_config
+
+    async def chat(self, model_id, messages, **params) -> dict:
+        # 1. 路由到 Adapter
+        # 2. 主模型失败时 Fallback
+        # 3. 流式中途失败不再 Fallback
+```
+
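+**Fallback rules**:
+- Configuration-driven via `runtime_config.ark_fallback_chain`
+- Mid-stream failure: raise immediately, no degradation (avoids mixing content from two models)
+- Synchronous call failure: degrade along the chain, transparently to the user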
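The synchronous fallback rule can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: `chat_with_fallback`, `_demo_call`, and the local `LLMError` stand-in are hypothetical names, and a real gateway would presumably catch `PlatformError` rather than every exception.

```python
import asyncio
from typing import Awaitable, Callable


class LLMError(Exception):
    """Stand-in for the document's LLMError (raised when the chain is exhausted)."""


ChatFn = Callable[[str], Awaitable[dict]]


async def chat_with_fallback(primary: str, fallback_chain: list, call_model: ChatFn) -> dict:
    """Try the primary model, then each fallback in order; raise LLMError if all fail."""
    errors = []
    for model_id in [primary, *fallback_chain]:
        try:
            result = await call_model(model_id)
            return {"model": model_id, **result}
        except Exception as exc:  # assumption: a real gateway would narrow this
            errors.append(f"{model_id}: {exc}")
    raise LLMError("all models failed: " + "; ".join(errors))


async def _demo_call(model_id: str) -> dict:
    # Simulated adapter call: the primary model is down, the lite model answers.
    if model_id == "doubao-seed-2-0-pro":
        raise RuntimeError("503 from vendor")
    return {"content": f"reply from {model_id}"}


demo_result = asyncio.run(
    chat_with_fallback("doubao-seed-2-0-pro", ["doubao-seed-2-0-lite"], _demo_call)
)
```

Streaming calls would bypass this loop once the first chunk has been emitted, in line with the "no degradation mid-stream" rule.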
+
+#### 3.2.2 Task Gateway
+
+```python
+class TaskGateway:
+    def __init__(self, adapters, storage, runtime_config):
+        self.adapters = adapters
+        self.storage = storage          # Redis
+        self.config = runtime_config
+        self.circuit = CircuitBreaker()
+
+    async def submit(self, platform_id, task_type, payload, callback_url=None) -> str:
+        # 1. 限流检查
+        # 2. 熔断检查
+        # 3. Adapter.submit() → 获取 platform_task_id
+        # 4. 生成 internal_task_id (UUID)
+        # 5. Redis 存储映射
+        # 6. 返回 internal_task_id
+
+    async def query(self, internal_task_id) -> TaskStatus:
+        # 1. 查 Redis 映射
+        # 2. 非终态时穿透供应商查询（可选）
+        # 3. 更新 Redis
+
+    async def handle_webhook(self, platform, headers, body, query):
+        # 1. nonce 防重放检查
+        # 2. Adapter.verify_signature()
+        # 3. Adapter.parse_callback()
+        # 4. 更新任务状态
+```
+
+**Internal ID isolation**:
+
+```python
+# Redis layout (key -> value), written as comments because it is a data layout, not executable code
+# task:{internal_task_id} -> {
+#     "platform_id": "vidu",
+#     "platform_task_id": "vidu_abc123",
+#     "task_type": "lip_sync",
+#     "state": "processing",
+#     "submitted_at": "2026-05-02T12:00:00Z"
+# }
+# TTL: 3600 seconds (1 hour)
+```
+
+**Polling scheduler** (Volcengine caption example):
+
+```python
+import asyncio
+import time
+
+async def poll_until_complete(self, internal_task_id, max_wait=120):
+    started = time.monotonic()
+    intervals = [0, 1, 2, 4, 8, 8, 10]  # non-blocking phase
+
+    for interval in intervals:
+        await asyncio.sleep(interval)
+        status = await self.query(internal_task_id)
+        if status.state == "completed":
+            return status
+        if status.state == "failed":
+            raise TaskError(status.error_message)
+
+    # Switch to the blocking phase until max_wait seconds have passed in total
+    while time.monotonic() - started < max_wait:
+        status = await self._query_with_blocking(internal_task_id)
+        if status.state in ("completed", "failed"):
+            return status
+
+    raise TaskError("Task timed out")
+```
+
+#### 3.2.3 Shared Infra
+
+**Token Bucket**（内存级，`aiolimiter`）：
+
+```python
+vidu_limiter = AsyncLimiter(max_rate=20, time_period=1.0)   # 20/s
+caption_limiter = AsyncLimiter(max_rate=2, time_period=1.0)  # 2/s
+ark_limiter = AsyncLimiter(max_rate=50, time_period=1.0)     # 50/s
+```
+
+**CircuitBreaker**：
+
+```python
+class CircuitBreaker:
+    failure_threshold: int = 5      # trip after 5 consecutive failures
+    recovery_timeout: float = 60.0  # probe for recovery after 60 seconds
+```
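The two fields above imply a small state machine. A minimal sketch under stated assumptions: the names `SimpleCircuitBreaker`, `allow`, `record_success`, and `record_failure` are hypothetical, and the half-open state is reduced to "allow a probe once `recovery_timeout` has passed".

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SimpleCircuitBreaker:
    failure_threshold: int = 5
    recovery_timeout: float = 60.0
    clock: Callable[[], float] = time.monotonic  # injectable so tests can fake time
    _failures: int = field(default=0, init=False)
    _opened_at: Optional[float] = field(default=None, init=False)

    def allow(self) -> bool:
        """True when closed, or when open long enough to let a probe through."""
        if self._opened_at is None:
            return True
        return self.clock() - self._opened_at >= self.recovery_timeout

    def record_success(self) -> None:
        """A successful call closes the breaker and resets the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count consecutive failures; (re)open the breaker at the threshold."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self.clock()
```

A gateway would call `allow()` before `Adapter.submit()` and fail fast with a retryable error when it returns `False`, which is how the "explicit error within 100 ms" goal can be met while a vendor is down.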
+
+**Retry Policy**（`stamina`）：
+
+```python
+async for attempt in stamina.retry_context(
+    on=(httpx.NetworkError, httpx.TimeoutException),
+    attempts=3,
+    timeout=30.0,
+    wait_initial=1.0,
+    wait_max=10.0,
+):
+    with attempt:
+        await adapter.submit(...)
+```
+
+---
+
+### 3.3 Application Service Layer
+
+**Responsibilities**: orchestrate business flows without being aware of platform differences.
+
+```python
+class ScriptService:
+    async def generate_script(self, category, subcategory, duration):
+        # 调用 LLM Gateway，不关心底层是火山方舟还是 OpenAI
+        result = await llm_gateway.chat(
+            model_id="doubao-seed-2-0-pro",
+            messages=[...],
+            temperature=0.7,
+        )
+        return self._parse_shots(result["content"])  # LLMGateway.chat() returns a dict
+
+class VideoService:
+    async def submit_lip_sync(self, video_url, audio_url):
+        # 调用 Task Gateway，不关心底层是 Vidu 还是 HeyGen
+        task_id = await task_gateway.submit(
+            platform_id="vidu",
+            task_type="lip_sync",
+            payload={"video_url": video_url, "audio_url": audio_url},
+            callback_url=f"{settings.app_base_url}/webhooks/vidu",
+        )
+        return task_id
+```
+
+---
+
+### 3.4 Router Layer
+
+**Responsibilities**: HTTP semantics translation, parameter validation, and a unified response format.
+
+**Unified error middleware**:
+
+```python
+@app.exception_handler(PlatformError)
+async def platform_error_handler(request, exc: PlatformError):
+    status = 502 if exc.retryable else 400
+    return JSONResponse(
+        status_code=status,
+        content={
+            "code": exc.status_code or 500,
+            "message": str(exc),
+            "data": None,
+            "detail": {
+                "platform": exc.platform,
+                "retryable": exc.retryable,
+            } if settings.DEBUG else None,
+        },
+    )
+```
+
+**Callback entry point**:
+
+```python
+@router.post("/webhooks/{platform}")
+async def universal_webhook(platform: str, request: Request):
+    raw_headers = dict(request.headers)
+    raw_body = await request.body()
+    query_params = dict(request.query_params)
+
+    await task_gateway.handle_webhook(
+        platform=platform,
+        headers=raw_headers,
+        body=raw_body,
+        query=query_params,
+    )
+    return {"received": True}
+```
+
+---
+
+## 4. Core Data Flows
+
+### 4.1 TTS Speech Synthesis (synchronous call)
+
+```
+User clicks "Generate voice-over"
+    ↓
+POST /voice/synthesize
+    ↓
+Router validates parameters
+    ↓
+ViduService.synthesize(text, voice_id...)
+    ↓
+PlatformGateway.call_sync(platform="vidu", method="tts", ...)
+    ↓
+Token Bucket acquires a token (rate=20/s)
+    ↓
+stamina retries network errors (up to 3 attempts)
+    ↓
+ViduAdapter.call(method="tts", ...)
+    ↓
+httpx.AsyncClient → Vidu API
+    ↓
+Returns the audio URL
+```
+
+**Failure paths**:
+- Network error → stamina retries → after 3 failed attempts raise PlatformError(retryable=True) → 502
+- Vidu returns 400 → PlatformError(retryable=False) → 400
+- Vidu returns 500 → PlatformError(retryable=True) → 502
+
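+### 4.2 Script Generation over SSE (streaming call)
+
+```
+User clicks "Generate script"
+    ↓
+POST /script/generate/stream
+    ↓
+ScriptService.generate_script_stream(...)
+    ↓
+LLM Gateway.chat_stream(model_id="doubao-seed-2-0-pro", ...)
+    ↓
+VolcengineArkAdapter.chat_stream(...)
+    ↓
+SSE streaming output
+```
+
+**Key constraint**: a mid-stream failure is **not degraded**. Content already emitted stays as is; the frontend handles the failure itself after receiving the error event.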
+
+### 4.3 Lip-Sync Task Submission (async task)
+
+```
+User clicks "Generate lip-sync"
+    ↓
+POST /vidu/lip-sync
+    ↓
+VideoService.submit_lip_sync(...)
+    ↓
+Task Gateway.submit(platform_id="vidu", task_type="lip_sync", ...)
+    ↓
+Token Bucket acquires a token (rate=5/s)
+    ↓
+CircuitBreaker check
+    ↓
+ViduAdapter.submit(task_type="lip_sync", ...)
+    ↓
+Returns platform_task_id
+    ↓
+Generate internal_task_id (UUID)
+    ↓
+Store the mapping in Redis
+    ↓
+Return {task_id: internal_task_id}
+```
+
+### 4.4 Caption Generation with Polling (async task + polling)
+
+```
+User clicks "Generate captions"
+    ↓
+POST /caption/generate
+    ↓
+CaptionService.generate_caption(...)
+    ↓
+Task Gateway.submit(platform_id="volcengine_caption", ...)
+    ↓
+Returns internal_task_id
+    ↓
+Frontend polls /tasks/{id}/status
+    ↓
+Gateway.query → Redis hit → return
+    ↓
+Redis miss or state=processing → query the vendor directly → update Redis
+```
+
+**Polling policy** (Volcengine caption):
+- 1st poll: t=0s, blocking=0
+- 2nd poll: t=1s, blocking=0
+- 3rd poll: t=3s, blocking=0
+- 4th poll onward: t=7s, 12s, 17s..., blocking=1
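The timestamps in the polling policy follow from cumulative gaps: 0, +1, +2 seconds for the non-blocking polls, then +4 seconds to the first blocking poll and +5 seconds between later ones. A small sketch of that arithmetic (the function name and the gap parameters are assumptions read off the listed timestamps, not values from the codebase):

```python
from itertools import islice
from typing import Iterator, Tuple


def caption_poll_schedule(
    non_blocking_gaps: Tuple[float, ...] = (0.0, 1.0, 2.0),
    first_blocking_gap: float = 4.0,
    blocking_gap: float = 5.0,
) -> Iterator[Tuple[float, int]]:
    """Yield (seconds since submit, blocking flag) for each poll, indefinitely."""
    t = 0.0
    for gap in non_blocking_gaps:
        t += gap
        yield t, 0
    t += first_blocking_gap
    while True:
        yield t, 1
        t += blocking_gap


# The first six polls: three non-blocking, then the blocking phase.
first_six = list(islice(caption_poll_schedule(), 6))
```

The caller is expected to stop consuming the generator once the task reaches a terminal state or the overall wait budget is exhausted.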
+
+### 4.5 Callback Handling (Vidu lip-sync completed)
+
+```
+Vidu server POST /webhooks/vidu
+    ↓
+Router extracts headers / body / query
+    ↓
+Task Gateway.handle_webhook(...)
+    ↓
+1. ViduAdapter.extract_nonce(headers) → nonce
+   → check in Redis whether the nonce was already used
+   → already used → 401
+2. ViduAdapter.verify_signature(headers, body, secret)
+   → failure → 401
+3. Mark the nonce as used in Redis (TTL 300s)
+4. ViduAdapter.parse_callback(body) → TaskStatus
+5. Update the task state in Redis
+```
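Steps 1 to 3 can be illustrated with the standard library alone. This is a sketch under loud assumptions: the header name `x-nonce` and the HMAC-SHA256-over-body scheme are hypothetical (Vidu's real signing scheme is not described here), and `NonceStore` is an in-memory stand-in for a Redis `SET ... NX EX 300`. The sketch also merges the "check" and "mark" steps into one atomic claim after verification, which avoids a race between two concurrent deliveries of the same callback.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional


class NonceStore:
    """In-memory stand-in for Redis SET NX with a TTL (hypothetical helper)."""

    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._seen: Dict[str, float] = {}

    def claim(self, nonce: str, now: Optional[float] = None) -> bool:
        """Return True if the nonce is new (and record it), False on replay."""
        now = time.time() if now is None else now
        expires = self._seen.get(nonce)
        if expires is not None and expires > now:
            return False
        self._seen[nonce] = now + self.ttl
        return True


def sign(body: bytes, secret: str) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme)."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def handle_webhook(headers: Dict[str, str], body: bytes, secret: str, nonces: NonceStore) -> int:
    """Return an HTTP-like status: 200 accepted, 401 rejected."""
    nonce = headers.get("x-nonce")
    signature = headers.get("x-hmac-signature", "")
    if not nonce:
        return 401
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(signature, sign(body, secret)):
        return 401
    if not nonces.claim(nonce):
        return 401
    return 200
```

Verifying the signature before claiming the nonce also means a forged request cannot burn a legitimate nonce.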
+
+---
+
+## 5. Concurrency Control
+
+### 5.1 Three-Layer Isolation Model
+
+```
+┌─────────────────────────────────────────┐
+│  第一层：任务层（Slot Scheduler）         │
+│  控制"同时有多少个异步任务在执行"          │
+│  - 火山字幕：max 5                        │
+│  - 对口型：按 Vidu 配额配置               │
+│  - 脚本生成：max 10                       │
+├─────────────────────────────────────────┤
+│  第二层：请求层（Gateway Token Bucket）  │
+│  控制"每秒向某平台发多少请求"              │
+│  - Vidu TTS：20/s                         │
+│  - Vidu 对口型提交：5/s                   │
+│  - 火山方舟：50/s                         │
+│  - 火山字幕提交：2/s                      │
+├─────────────────────────────────────────┤
+│  第三层：连接层（HTTP Client Pool）      │
+│  控制"同时保持多少条 TCP 连接"            │
+│  - Vidu：max 20                           │
+│  - 火山字幕：max 10                       │
+│  - 火山方舟：SDK 内部管理                 │
+└─────────────────────────────────────────┘
+```
+
+### 5.2 Streaming Connections Are Counted Separately
+
+```python
+# LLM Gateway 内
+active_streams: dict[str, int] = {}  # {platform: count}
+
+# 流式上限
+MAX_STREAMS = {
+    "volcengine_ark": 30,
+    "openai": 30,
+}
+```
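One way the per-platform stream counter could be enforced is an async context manager that takes a slot on entry and always releases it on exit. A minimal sketch; `stream_slot` and `StreamLimitExceeded` are hypothetical names, and it assumes a single event loop (there is no `await` between the check and the increment, so no lock is needed).

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Dict, List

MAX_STREAMS: Dict[str, int] = {"volcengine_ark": 30, "openai": 30}
active_streams: Dict[str, int] = {}


class StreamLimitExceeded(Exception):
    """Hypothetical error for this sketch; a real gateway might raise PlatformError."""


@asynccontextmanager
async def stream_slot(platform: str, limits: Dict[str, int] = MAX_STREAMS):
    current = active_streams.get(platform, 0)
    if current >= limits[platform]:
        raise StreamLimitExceeded(platform)
    active_streams[platform] = current + 1
    try:
        yield
    finally:
        # Released even if the stream fails halfway through.
        active_streams[platform] -= 1


async def _demo() -> List[int]:
    seen = []
    async with stream_slot("openai"):
        seen.append(active_streams["openai"])
        async with stream_slot("openai"):
            seen.append(active_streams["openai"])
    seen.append(active_streams["openai"])
    return seen


demo_counts = asyncio.run(_demo())
```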
+
+---
+
+## 6. Error Handling
+
+### 6.1 Exception Classes
+
+```python
+class PlatformError(Exception):
+    """第三方平台调用失败"""
+    def __init__(self, message, *, platform: str, retryable: bool = False, status_code: int | None = None):
+        super().__init__(message)
+        self.platform = platform
+        self.retryable = retryable
+        self.status_code = status_code
+
+class TaskError(Exception):
+    """任务生命周期错误"""
+    pass
+
+class LLMError(Exception):
+    """LLM 调用失败（含 Fallback 耗尽）"""
+    pass
+```
+
+### 6.2 HTTP Status Code Mapping
+
+| Scenario | PlatformError attribute | HTTP status code |
+|------|-------------------|------------|
+| Network timeout, DNS failure, 5xx | `retryable=True` | 502 Bad Gateway |
+| Vendor rate limiting (429) | `retryable=True` | 429 Too Many Requests |
+| Authentication failure (401/403) | `retryable=False` | 401 Unauthorized |
+| Invalid parameters (400) | `retryable=False` | 400 Bad Request |
+| Business logic error (state=failed) | `retryable=False` | 400 Bad Request |
+
+### 6.3 Global Response Format
+
+```json
+{
+    "code": 0,
+    "message": "成功",
+    "data": {}
+}
+```
+
+On error:
+
+```json
+{
+    "code": 500,
+    "message": "Vidu TTS 服务暂不可用",
+    "data": null,
+    "detail": {
+        "platform": "vidu",
+        "retryable": true
+    }
+}
+```
+
+---
+
+## 7. Configuration Conventions
+
+### 7.1 Nested Configuration Models
+
+```python
+class ViduConfig(BaseModel):
+    api_key: str = ""
+    base_url: str = "https://api.vidu.com"
+    max_connections: int = 20
+    timeout: float = 30.0
+
+class RuntimeConfig(BaseModel):
+    """运行时配置，支持热重载"""
+    vidu_qps: float = 20.0
+    vidu_burst: int = 30
+    ark_fallback_chain: list[str] = ["doubao-seed-2-0-lite"]
+    caption_poll_intervals: list[float] = [1.0, 1.0, 2.0, 2.0, 4.0, 4.0, 8.0, 8.0, 10.0]
+    circuit_failure_threshold: int = 5
+    circuit_recovery_timeout: float = 60.0
+
+class Settings(BaseSettings):
+    vidu: ViduConfig = Field(default_factory=ViduConfig)
+    volcengine_ark: VolcengineArkConfig = Field(default_factory=VolcengineArkConfig)
+    volcengine_caption: VolcengineCaptionConfig = Field(default_factory=VolcengineCaptionConfig)
+    openai: OpenAIConfig = Field(default_factory=OpenAIConfig)
+    runtime: RuntimeConfig = Field(default_factory=RuntimeConfig)
+
+    model_config = SettingsConfigDict(
+        env_nested_delimiter="__",
+    )
+```
+
+### 7.2 `.env` Example
+
+```bash
+# === Startup configuration (restart required after changes) ===
+VIDU__API_KEY=sk-xxx
+VIDU__BASE_URL=https://api.vidu.com
+VIDU__MAX_CONNECTIONS=20
+
+VOLCENGINE_ARK__API_KEY=ak-xxx
+VOLCENGINE_CAPTION__APPID=app-xxx
+VOLCENGINE_CAPTION__TOKEN=tk-xxx
+
+OPENAI__API_KEY=sk-xxx
+
+# === Runtime configuration (hot-reloadable) ===
+# Names after RUNTIME__ match the RuntimeConfig field names; list values are JSON.
+RUNTIME__VIDU_QPS=20
+RUNTIME__VIDU_BURST=30
+RUNTIME__ARK_FALLBACK_CHAIN='["doubao-seed-2-0-lite","doubao-lite-32k"]'
+RUNTIME__CAPTION_POLL_INTERVALS='[1,1,2,2,4,4,8,8,10]'
+RUNTIME__CIRCUIT_FAILURE_THRESHOLD=5
+```
+
+### 7.3 Hot-Reload API
+
+```python
+@router.post("/admin/runtime-config")
+async def reload_runtime_config(updates: dict):
+    gateway_registry.update_runtime_config(**updates)
+    return {"updated": list(updates.keys())}
+```
+
+---
+
+## 8. Logging and Observability
+
+### 8.1 Structured Log Fields
+
+```python
+{
+    "event": "platform_call",
+    "platform": "vidu",
+    "method": "tts_sync",
+    "task_type": "tts",
+    "duration_ms": 1250,
+    "success": True,
+    "http_status": 200,
+    "retry_count": 0,
+}
+```
+
+### 8.2 Redaction Rules
+
+| Level | Fields | Handling in production |
+|------|------|------------|
+| P1 | `api_key`, `authorization`, `x-hmac-signature` | `[REDACTED]` |
+| P2 | `audio_url`, `video_url`, `text` | URLs: signature parameters stripped / text: truncated to the first 30 characters |
+| P3 | `platform_task_id`, `internal_task_id` | Keep the first 8 characters |
+| P4 | `duration_ms`, `http_status`, `retry_count` | Kept in full |
+
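The four redaction levels can be sketched as a single pass over a log record. This is an illustration only: the `redact` function is hypothetical, and for simplicity it drops the whole query string and fragment of a URL, whereas the rule above only requires removing signature parameters.

```python
from urllib.parse import urlsplit, urlunsplit

P1_SECRETS = {"api_key", "authorization", "x-hmac-signature"}
P2_URLS = {"audio_url", "video_url"}
P3_IDS = {"platform_task_id", "internal_task_id"}


def redact(record: dict) -> dict:
    """Return a copy of a log record with P1-P3 fields redacted; P4 passes through."""
    out = {}
    for key, value in record.items():
        name = key.lower()
        if name in P1_SECRETS:
            out[key] = "[REDACTED]"
        elif name in P2_URLS:
            # Assumption: stripping the whole query string is acceptable.
            parts = urlsplit(str(value))
            out[key] = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        elif name == "text":
            out[key] = str(value)[:30]
        elif name in P3_IDS:
            out[key] = str(value)[:8]
        else:
            out[key] = value
    return out
```

Hooking such a function into the structured logger as a processor keeps redaction in one place instead of at every call site.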
+### 8.3 Health Check Endpoint
+
+```python
+@router.get("/system/platform-health")
+async def platform_health():
+    results = {}
+    for pid, adapter in registry.adapters.items():
+        resp = await adapter.health()
+        results[pid] = {
+            "available": resp.success,
+            "error": resp.error_message,
+        }
+    return results
+```
+
+---
+
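+## 9. Migration Strategy
+
+### 9.1 Migration Principles
+
+- **Old and new code run side by side**: switched by a flag, so rollback is possible at any time
+- **Migrate one platform at a time**: Vidu → Volcengine caption → LLM
+- **Transparent to the frontend**: Router URLs, request bodies, and response bodies stay unchanged
+
+### 9.2 Flag Switching Mechanism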
+
+```python
+# Router 层
+USE_NEW_VIDU = settings.FEATURE_FLAGS.get("new_vidu_adapter", False)
+
+@router.post("/voice/synthesize")
+async def synthesize(request: TTSSynthesizeRequest):
+    if USE_NEW_VIDU:
+        service = get_vidu_service_v2()  # 新架构
+    else:
+        service = get_vidu_service()       # 旧代码
+    ...
+```
+
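+### 9.3 Migration Checklist
+
+| Step | Action | Verification |
+|------|------|------|
+| 1 | Create `ViduAdapterV2` implementing the Protocol | Unit tests pass |
+| 2 | Register it with the Gateway, flag off | No impact on production |
+| 3 | Turn the flag on in the test environment and run a full regression | All Vidu endpoints work |
+| 4 | Gradual production rollout 10% → 50% → 100% | Monitor the error rate |
+| 5 | Keep the old code for 1 week, then delete it | No rollback needed |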
+
+---
+
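+## 10. Appendix: Final Decision List
+
+| # | Decision | Conclusion |
+|---|--------|------|
+| 1 | Where callback signatures are verified | **C**: Router extracts raw data → Gateway dispatches → Adapter verifies |
+| 2 | Task result retention | Reflected in real time; Redis mapping TTL = 1h |
+| 3 | Qiniu Cloud | Not included in the new architecture |
+| 4 | SSE disconnects | Resumption is not supported |
+| 5 | MockAdapter | Registered only when `DEBUG=True` |
+| 6 | Config hot reload | **B**: rate-limit parameters and the fallback chain are hot-reloadable; Adapters require a restart |
+| 7 | Log redaction | Four levels (P1/P2/P3/P4) + four environment tiers + smart URL stripping |
+| 8 | Volcengine caption polling | **B**: first 3 polls non-blocking (0→1→3s), then switch to `blocking=1` |
+| 9 | Migration strategy | **C**: adapter layer first, switched by flag |
+| 10 | API keys | Maintained manually |
+| 11 | Task state persistence | **C**: enable AOF persistence in Redis |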
@@ -0,0 +1,532 @@
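+# Third-Party Platform Integration Architecture Standardization: Implementation Plan
+
+> Version: v1.0
+> Basis: third-party-integration-architecture.md (final architecture design)
+> Goal: unify the exception hierarchy, the Adapter contract, the HTTP client lifecycle, the async task state machine, and configuration layering
+> Effective date: 2026-05-03
+
+---
+
+## 1. General Principles
+
+### 1.1 Implementation Principles
+
+- **Standards first**: follow mainstream industry practice; do not compromise just because "the change is small"
+- **Documentation first**: every change must be registered in this document and checked off item by item after implementation
+- **Direct refactoring**: existing code is refactored directly to the standard, with no compatibility wrapper layer
+- **Incremental verification**: run the tests after each phase and confirm there are no regressions before moving on
+
+### 1.2 Exceptions to This Standard
+
+- Qiniu Cloud storage (pure upload/download; not part of the Adapter system)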
+
+---
+
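+## 2. The Five Iron Rules (Implementation Standard)
+
+### Iron Rule 1: A Single Exit for Exceptions
+
+**Rule**: all code under `app/services/` and `app/ai/` must raise only `PlatformError` to callers. Routers only `except PlatformError` and `AppException` (business errors).
+
+**Standard**: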
+
+```python
+# app/core/exceptions.py —— 唯一的第三方异常类
+class PlatformErrorType:
+    RATE_LIMIT = "rate_limit"
+    AUTH_FAILED = "auth_failed"
+    TIMEOUT = "timeout"
+    SERVER_ERROR = "server_error"
+    BAD_REQUEST = "bad_request"
+    QUOTA_EXHAUSTED = "quota_exhausted"
+    NOT_FOUND = "not_found"
+    UNKNOWN = "unknown"
+
+class PlatformError(Exception):
+    def __init__(
+        self,
+        message: str,
+        *,
+        platform: str,
+        retryable: bool = False,
+        error_type: str = PlatformErrorType.UNKNOWN,
+        status_code: int | None = None,
+    ):
+        super().__init__(message)
+        self.platform = platform
+        self.retryable = retryable
+        self.error_type = error_type
+        self.status_code = status_code
+```
+
+**HTTP status code mapping** (global middleware):
+
+| error_type | retryable | HTTP status code |
+|-----------|-----------|------------|
+| rate_limit | True | 429 |
+| timeout | True | 504 |
+| server_error | True | 502 |
+| auth_failed | False | 401 |
+| bad_request | False | 400 |
+| quota_exhausted | False | 429 (with Retry-After) |
+| unknown | False | 400 |
+
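A global handler could derive the HTTP status from `error_type` with a plain lookup. A minimal sketch: the function names are hypothetical, the fallback of 400 for types the table does not list (such as `not_found`) is an assumption, and so is the illustrative 60-second `Retry-After` default.

```python
from typing import Dict

# Copied from the error_type table; unlisted types fall back to 400 (assumption).
STATUS_BY_ERROR_TYPE: Dict[str, int] = {
    "rate_limit": 429,
    "timeout": 504,
    "server_error": 502,
    "auth_failed": 401,
    "bad_request": 400,
    "quota_exhausted": 429,
    "unknown": 400,
}


def http_status_for(error_type: str) -> int:
    """Map a PlatformErrorType value to the HTTP status the Router returns."""
    return STATUS_BY_ERROR_TYPE.get(error_type, 400)


def response_headers_for(error_type: str, retry_after_seconds: int = 60) -> Dict[str, str]:
    """quota_exhausted carries Retry-After; every other type adds no headers."""
    if error_type == "quota_exhausted":
        return {"Retry-After": str(retry_after_seconds)}
    return {}
```

Keeping the mapping in one dictionary makes it easy to assert in a unit test that every `PlatformErrorType` member has an explicit entry.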
+**Prohibited**:
+- [ ] `raise HTTPException` inside `app/services/` or `app/ai/`
+- [ ] Bare `raise Exception(...)` inside `app/services/` or `app/ai/`
+- [ ] `except Exception: raise HTTPException(500, ...)` in any Router to handle third-party errors
+
+---
+
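+### Iron Rule 2: Minimal Adapter Contract
+
+**Rule**: every new platform must implement `PlatformAdapter` (`platform_id` + `health()` + `close()`), and implements `SyncCapable` (synchronous calls) or `TaskCapable` (async tasks) as needed.
+
+**Protocol definitions**: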
+
+```python
+# app/ai/adapters/base.py
+
+@runtime_checkable
+class PlatformAdapter(Protocol):
+    """所有 Adapter 的准入门槛"""
+    platform_id: str
+    async def health(self) -> AdapterResponse: ...
+    async def close(self) -> None: ...
+
+@runtime_checkable
+class SyncCapable(Protocol):
+    """同步调用能力（TTS、Chat、图片生成等）"""
+    async def call(self, method: str, payload: dict) -> AdapterResponse: ...
+
+@runtime_checkable
+class TaskCapable(Protocol):
+    """异步任务能力（对口型、字幕、视频生成等）"""
+    async def submit(self, task_type: str, payload: dict, callback_url: str | None) -> AdapterResponse: ...
+    async def query(self, platform_job_id: str) -> TaskStatus: ...
+
+@runtime_checkable
+class CallbackCapable(Protocol):
+    """回调验签能力（可选）"""
+    async def verify_signature(self, headers: dict, body: bytes, secret: str) -> bool: ...
+    async def parse_callback(self, body: bytes) -> TaskStatus: ...
+```
+
+**Unified return values**:
+
+```python
+@dataclass(frozen=True)
+class AdapterResponse:
+    success: bool
+    data: dict | None = None
+    error_code: str | None = None
+    error_message: str | None = None
+    retryable: bool = False
+
+@dataclass(frozen=True)
+class TaskStatus:
+    state: str  # "pending" | "processing" | "completed" | "failed"
+    result: dict | None = None
+    error_message: str | None = None
+```
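Because the Protocols are `@runtime_checkable`, registration code can reject an object that lacks the required members. The sketch below redeclares trimmed copies of `AdapterResponse`, `PlatformAdapter`, and `SyncCapable` so that it runs on its own; `MockAdapter` and `NotAnAdapter` are hypothetical. Note that `isinstance` against a Protocol only checks that the attributes exist, not their signatures or return types.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@dataclass(frozen=True)
class AdapterResponse:
    success: bool
    data: Optional[dict] = None
    error_message: Optional[str] = None


@runtime_checkable
class PlatformAdapter(Protocol):
    platform_id: str
    async def health(self) -> AdapterResponse: ...
    async def close(self) -> None: ...


@runtime_checkable
class SyncCapable(Protocol):
    async def call(self, method: str, payload: dict) -> AdapterResponse: ...


class MockAdapter:
    """Satisfies PlatformAdapter + SyncCapable structurally, without inheriting."""
    platform_id = "mock"

    async def health(self) -> AdapterResponse:
        return AdapterResponse(success=True)

    async def close(self) -> None:
        return None

    async def call(self, method: str, payload: dict) -> AdapterResponse:
        return AdapterResponse(success=True, data={"method": method, "echo": payload})


class NotAnAdapter:
    """Has platform_id but no health()/close(), so it fails the check."""
    platform_id = "broken"
```

A gateway's `register()` could assert `isinstance(adapter, PlatformAdapter)` and then probe `SyncCapable` / `TaskCapable` to decide which entry points the adapter is exposed through.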
+
+**Method identifier constants**:
+
+```python
+# app/ai/adapters/constants.py
+class Method:
+    TTS = "tts"
+    CHAT = "chat"
+    CHAT_STREAM = "chat_stream"
+    IMAGE_GENERATE = "image_generate"
+    LIP_SYNC = "lip_sync"
+    CAPTION = "caption"
+    AUTO_ALIGN = "auto_align"
+```
+
+**Platform-to-Protocol assignment**:
+
+| Platform | Must implement | Current status |
+|-----|---------|---------|
+| Volcengine Ark | `PlatformAdapter + SyncCapable` | ❌ Missing; `VolcengineArkAdapter` must be created |
+| Vidu | `PlatformAdapter + SyncCapable + TaskCapable + CallbackCapable` | ❌ Missing; `ViduAdapter` must be created |
+| Volcengine caption | `PlatformAdapter + TaskCapable` | ❌ Missing; `VolcengineCaptionAdapter` must be created |
+
+**Prohibited**:
+- [ ] Integrating a new platform without implementing `PlatformAdapter`
+- [ ] Raising anything other than `PlatformError` from inside an Adapter
+- [ ] Returning a bare `dict` from `call()` instead of an `AdapterResponse`
+
+---
+
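+### Iron Rule 3: HTTP Clients Are Closed in One Place
+
+**Rule**: every outbound HTTP connection (`httpx.AsyncClient` or SDK-internal) must be created and destroyed in `lifespan`. Creating a temporary `httpx.AsyncClient()` inside a method is prohibited.
+
+**Standard**: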
+
+```python
+# Standard lifespan pattern
+from contextlib import asynccontextmanager
+
+import httpx
+from fastapi import FastAPI
+from volcenginesdkarkruntime import AsyncArk
+
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    # One independent client per platform (fault isolation)
+    app.state.http_clients = {
+        "vidu": httpx.AsyncClient(timeout=30, limits=httpx.Limits(max_connections=20)),
+        "volcengine_caption": httpx.AsyncClient(timeout=60, limits=httpx.Limits(max_connections=10)),
+        "default": httpx.AsyncClient(timeout=30, limits=httpx.Limits(max_connections=50)),
+    }
+
+    # SDK client
+    app.state.ark_client = AsyncArk(api_key=..., timeout=1800)
+
+    try:
+        yield
+    finally:
+        # Close everything in one place, even when shutdown follows an error
+        for client in app.state.http_clients.values():
+            await client.aclose()
+
+        if hasattr(app.state, "ark_client") and not app.state.ark_client.is_closed():
+            await app.state.ark_client.close()
+
+**Migration list**:
+
+| File | Current client | Remediation |
+|-----|------------|---------|
+| `app/ai/providers/vidu_provider.py` | `aiohttp.ClientSession` | Migrate to `httpx.AsyncClient`, injected by lifespan |
+| `app/services/volcengine_caption_service.py` | `httpx.AsyncClient` (lazily created, never closed) | Inject from lifespan; delete the lazy `_get_client()` |
+| `app/api/v1/voice.py` | Temporary `httpx.AsyncClient()` | Use `app.state.http_clients["default"]` |
+
+**Prohibited**:
+- [ ] Importing `aiohttp` (the project standardizes on `httpx`)
+- [ ] Creating a temporary `httpx.AsyncClient()` inside a method/route (large file downloads are an exception and must be explained in a comment)
+- [ ] Creating a client in a Provider's `__init__` without closing it in lifespan
+
+---
+
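+### Iron Rule 4: A Single Source of Async Task State
+
+**Rule**: every "submit and wait" task must write its state to the unified state machine. Third-party callbacks go through the single entry point `/webhooks/{platform}`.
+
+> **Note**: this rule involves database design. The SQLAlchemy `Job` model design in this document is on hold and will be added once the database approach is decided. This chapter only specifies the interface and state machine standards.
+
+**Unified status enum**: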
+
+```python
+class JobStatus(str, Enum):
+    PENDING = "pending"       # 已提交，等待调度
+    QUEUED = "queued"         # 已进入队列，等待槽位
+    RUNNING = "running"       # 正在执行
+    SUCCEEDED = "succeeded"   # 成功完成
+    FAILED = "failed"         # 失败
+    CANCELLED = "cancelled"   # 用户取消或超时取消
+```
+
+**Unified task API**:
+
+```python
+# 提交任务
+POST /jobs
+Request: {platform: str, task_type: str, payload: dict, idempotency_key: str | None}
+Response: {job_id: UUID, status: "pending"}
+
+# 查询任务
+GET /jobs/{job_id}
+Response: {job_id, status, progress, message, result, error, created_at, updated_at}
+
+# 统一回调入口
+POST /webhooks/{platform}
+```
+
+**Third-party state mapping** (owned by the Adapter layer):
+
+| Third-party state | Internal state |
+|-----------|---------|
+| Vidu: `pending` / `processing` | `RUNNING` |
+| Vidu: `success` | `SUCCEEDED` |
+| Vidu: `failed` | `FAILED` |
+| Volcengine caption: `code=2000` | `RUNNING` |
+| Volcengine caption: `code=0` | `SUCCEEDED` |
+| Volcengine caption: `code=1001/1002/1012` | `FAILED` (not retryable) |
+| Volcengine caption: `code=1003` (rate exceeded) | `FAILED` (retryable) |
+
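The state mapping table translates directly into lookup tables inside each Adapter. A minimal sketch: the enum mirrors `JobStatus` from this chapter, while the helper names and the choice to raise `ValueError` on an unmapped value are assumptions of the sketch.

```python
from enum import Enum
from typing import Tuple


class JobStatus(str, Enum):
    PENDING = "pending"
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    CANCELLED = "cancelled"


VIDU_STATE_MAP = {
    "pending": JobStatus.RUNNING,
    "processing": JobStatus.RUNNING,
    "success": JobStatus.SUCCEEDED,
    "failed": JobStatus.FAILED,
}

CAPTION_CODE_MAP = {
    2000: JobStatus.RUNNING,
    0: JobStatus.SUCCEEDED,
    1001: JobStatus.FAILED,
    1002: JobStatus.FAILED,
    1012: JobStatus.FAILED,
    1003: JobStatus.FAILED,
}
CAPTION_RETRYABLE_CODES = frozenset({1003})  # rate exceeded


def map_vidu_state(state: str) -> JobStatus:
    """Translate a Vidu state string; unmapped values are surfaced, not guessed."""
    if state not in VIDU_STATE_MAP:
        raise ValueError(f"unmapped Vidu state: {state!r}")
    return VIDU_STATE_MAP[state]


def map_caption_code(code: int) -> Tuple[JobStatus, bool]:
    """Translate a caption response code into (internal status, retryable)."""
    if code not in CAPTION_CODE_MAP:
        raise ValueError(f"unmapped caption code: {code}")
    return CAPTION_CODE_MAP[code], code in CAPTION_RETRYABLE_CODES
```

Failing loudly on an unknown vendor state keeps a new upstream value from silently leaking to the frontend.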
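+**Prohibited**:
+- [ ] Routers/Services defining their own Redis keys (e.g. `vidu:lipsync:xxx`)
+- [ ] Handling callback signature verification directly in a Router (the Adapter must do it)
+- [ ] Platforms returning their own state strings to the frontend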
+
+---
+
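+### Iron Rule 5: Configuration and Secrets Are Separated
+
+**Rule**: non-sensitive configuration lives in `config/platform-config.yaml` and supports hot reload. Secrets live in `.env` and require a restart to change.
+
+**File structure**: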
+
+```yaml
+# config/platform-config.yaml
+platforms:
+  <platform_id>:
+    provider: <provider_type>
+    base_url: <url>
+    models:  # 原 ai_models.yaml 内容合并至此
+      - id: <model_alias>
+        model_name: <实际模型ID>
+        capabilities: [<capability>]
+        default_params: <dict>
+    rate_limit:
+      qps: <float>
+      burst: <int>
+    methods:
+      <method>:
+        timeout: <int>
+        max_connections: <int>
+        rate_limit:
+          qps: <float>
+          burst: <int>
+
+runtime:
+  fallback_chains:
+    <capability>:
+      - <model_alias_1>
+      - <model_alias_2>
+  task_timeouts:
+    <task_type>: <seconds>
+  task_ttl:
+    <task_type>: <seconds>
+```
+
+**Hot-reload implementation**:
+
+```python
+class RuntimeConfig:
+    """运行时配置，轮询检查 mtime（10秒间隔）+ Admin API 手动触发"""
+    
+    async def get(self, key: str, default=None):
+        await self._reload_if_changed()
+        return self._config.get(key, default)
+    
+    async def force_reload(self) -> bool:
+        """Admin API 调用"""
+        ...
+```
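The mtime-based reload can be sketched with the standard library. Assumptions are stated loudly: JSON stands in for YAML so the example has no third-party dependency, the class name `FileBackedConfig` is hypothetical, and the file is checked on every `get()` rather than on the 10-second interval described above.

```python
import json
import os
import tempfile
from pathlib import Path


class FileBackedConfig:
    """Reload a JSON file whenever its modification time changes."""

    def __init__(self, path: Path):
        self.path = path
        self.version = 0
        self._mtime_ns = -1
        self._config: dict = {}
        self.reload_if_changed()

    def reload_if_changed(self) -> bool:
        mtime_ns = self.path.stat().st_mtime_ns
        if mtime_ns == self._mtime_ns:
            return False
        self._config = json.loads(self.path.read_text(encoding="utf-8"))
        self._mtime_ns = mtime_ns
        self.version += 1
        return True

    def get(self, key: str, default=None):
        self.reload_if_changed()
        return self._config.get(key, default)


def _demo() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "runtime.json"
        path.write_text(json.dumps({"qps": 20}), encoding="utf-8")
        cfg = FileBackedConfig(path)
        before = (cfg.get("qps"), cfg.version)
        path.write_text(json.dumps({"qps": 5}), encoding="utf-8")
        # Push mtime forward explicitly so coarse filesystem clocks still see a change.
        st = path.stat()
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
        after = (cfg.get("qps"), cfg.version)
        return [before, after]


demo_values = _demo()
```

A production version would likely throttle the `stat()` call and keep serving the previous configuration if the new file fails to parse.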
+
+**Admin API**：
+
+```python
+@router.post("/admin/runtime-config/reload")
+async def reload_runtime_config():
+    success = await runtime_config.force_reload()
+    return {"reloaded": success, "version": runtime_config.version}
+
+@router.get("/admin/runtime-config")
+async def get_runtime_config():
+    return runtime_config.get_raw()
+```
+
+**Migration list**:
+
+| Setting in `.env` | Migration target | Status |
+|------------------|---------|------|
+| `VIDU_BASE_URL` | `platforms.vidu.base_url` | Pending |
+| `VOLCENGINE_BASE_URL` | `platforms.volcengine_ark.base_url` | Pending |
+| `VOLC_SUBTITLE_MAX_CONCURRENT` | `platforms.volcengine_caption.methods.caption.max_connections` | Pending |
+| `VOLC_SUBTITLE_TIMEOUT` | `runtime.task_timeouts.caption` | Pending |
+
+**Prohibited**:
+- [ ] Storing non-sensitive configuration (URLs, timeouts, rate limits) in `.env`
+- [ ] Hardcoding configuration in code (e.g. `timeout=30`, `max_rate=20`)
+- [ ] Letting the `Settings` class exceed 150 lines (slim it down gradually)
+
+---
+
+## 3. Phased Implementation Plan
+
+### Phase 0: Preparation (0.5 day)
+
+| # | Task | Output file | How to check |
+|---|------|---------|---------|
+| 0.1 | Create the `app/ai/adapters/` directory structure | `app/ai/adapters/__init__.py` | Directory exists |
+| 0.2 | Create the `app/platform_gateway.py` skeleton | `app/platform_gateway.py` | File exists, class definition complete |
+| 0.3 | Install/confirm `import-linter` is available | `pyproject.toml` dependency | `pip show import-linter` |
+| 0.4 | Back up the existing `exceptions.py` | git stash / branch | Can be rolled back |
+
+### Phase 1：异常体系（0.5 天）
+
+| # | 任务 | 输出文件 | 检查方式 |
+|---|------|---------|---------|
+| 1.1 | 重构 `PlatformError` + `PlatformErrorType` | `app/core/exceptions.py` | 类型定义完整，含所有字段 |
+| 1.2 | 保留 `AppException` 体系（业务错误） | `app/core/exceptions.py` | 原有类不删除 |
+| 1.3 | `main.py` 注册 `PlatformError` 全局中间件 | `app/main.py` | 启动无报错，异常测试返回正确 HTTP 码 |
+| 1.4 | `VolcengineArkAdapter._wrap_error()` 实现异常映射 | `app/ai/adapters/volcengine_ark.py` | 单元测试覆盖 |
+| 1.5 | `ViduAdapter._wrap_error()` 实现异常映射 | `app/ai/adapters/vidu.py` | 单元测试覆盖 |
+| 1.6 | `make lint-semantic` 增加异常规则 | `Makefile` | 提交时自动检查 |
+
+**验收标准**：
+- [ ] 任意第三方调用失败，Router 返回的 JSON 中 `detail.retryable` 正确
+- [ ] 网络超时返回 504，限流返回 429，认证失败返回 401
+- [ ] 业务错误（如参数校验失败）仍走 `AppException` → 400/422
+
+### Phase 2：Adapter Protocol + 配置合并（1 天）
+
+| # | 任务 | 输出文件 | 检查方式 |
+|---|------|---------|---------|
+| 2.1 | `PlatformAdapter` / `SyncCapable` / `TaskCapable` / `CallbackCapable` Protocol | `app/ai/adapters/base.py` | `isinstance` 校验通过 |
+| 2.2 | `AdapterResponse` / `TaskStatus` dataclass | `app/ai/adapters/base.py` | frozen=True，字段完整 |
+| 2.3 | `Method` 常量定义 | `app/ai/adapters/constants.py` | 覆盖所有现有方法 |
+| 2.4 | 合并 `ai_models.yaml` → `platform-config.yaml` | `config/platform-config.yaml` | 原有模型列表完整迁移 |
+| 2.5 | `RuntimeConfig` 热重载实现 | `app/core/runtime_config.py` | mtime 轮询 + force_reload 均工作 |
+| 2.6 | Admin API `/admin/runtime-config/*` | `app/api/v1/system.py` | GET/POST 返回正确 |
+| 2.7 | `Settings` 类清理非敏感配置 | `app/config.py` | 只保留密钥，行数 < 150 |
+
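+**Acceptance criteria**:
+- [ ] Add a MockAdapter implementing the Protocol; the IDE flags any missing methods
+- [ ] Change the qps in `config/platform-config.yaml`; new requests pick it up within 10 seconds
+- [ ] Trigger a reload manually via the Admin API; it returns the latest configuration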
+
+### Phase 3：HTTP Client 统一（1 天）
+
+| # | 任务 | 输出文件 | 检查方式 |
+|---|------|---------|---------|
+| 3.1 | `ViduProvider` 从 `aiohttp` 迁移到 `httpx` | `app/ai/providers/vidu_provider.py` | 功能测试通过 |
+| 3.2 | `VolcengineCaptionService` 删除懒加载，改为注入 Client | `app/services/volcengine_caption_service.py` | 功能测试通过 |
+| 3.3 | `voice.py` 中临时 `httpx.AsyncClient()` 改为共享 Client | `app/api/v1/voice.py` | 代码审查 |
+| 3.4 | lifespan 统一管理所有 Client 生命周期 | `app/main.py` | 启动/关闭无泄漏日志 |
+| 3.5 | `ViduAdapter.close()` / `VolcengineCaptionAdapter.close()` 实现 | 对应 Adapter 文件 | lifespan shutdown 时调用 |
+| 3.6 | `make lint` 增加 `aiohttp` import 禁止规则 | `pyproject.toml` 或 pre-commit | import aiohttp 报 error |
+
+**验收标准**：
+- [ ] `pip list | grep aiohttp` 无输出（或确认仅作为间接依赖）
+- [ ] `python -m app.main` 启动后，关闭时无 `unclosed client session` 警告
+- [ ] 所有 `AsyncClient` 创建都在 lifespan 中
+
+### Phase 4：Gateway 骨架 + Adapter 包装层（1 天）
+
+| # | 任务 | 输出文件 | 检查方式 |
+|---|------|---------|---------|
+| 4.1 | `PlatformGateway` 骨架（`call_sync` / `submit_task` / `query_task` / `handle_webhook`） | `app/platform_gateway.py` | 类方法签名完整 |
+| 4.2 | `VolcengineArkAdapter` 改造现有 Provider 实现 Protocol | `app/ai/adapters/volcengine_ark.py` | 单元测试通过 |
+| 4.3 | `ViduAdapter` 改造现有 Provider 实现 Protocol | `app/ai/adapters/vidu.py` | 单元测试通过 |
+| 4.4 | `VolcengineCaptionAdapter` 改造现有 Service 实现 Protocol | `app/ai/adapters/volcengine_caption.py` | 单元测试通过 |
+| 4.5 | `LLMGateway` 实现（模型选择、Fallback、流式路由） | `app/ai/gateways/llm_gateway.py` | 脚本生成功能测试通过 |
+| 4.6 | lifespan 中初始化所有 Adapter 并注册到 Gateway | `app/main.py` | 启动日志显示各平台初始化成功 |
+
+**验收标准**：
+- [ ] 新增一个 `MockAdapter` 实现 Protocol，5 分钟内完成注册并可用
+- [ ] `LLMGateway.chat()` 主模型失败时自动 Fallback 到备用模型
+- [ ] 健康检查 `/system/platform-health` 返回所有平台状态
+
+### Phase 5：异步任务统一（2 天，数据库方案确定后实施）
+
+| # | Task | Output file | How to check |
+|---|------|---------|---------|
+| 5.1 | SQLAlchemy `Job` model (designed separately) | `app/models/job.py` | Alembic migration succeeds |
+| 5.2 | Pydantic `JobResponse` schema | `app/schemas/job.py` | Covers all fields |
+| 5.3 | Change `JobRegistry` to write to the database first, then to Redis | `app/scheduler/registry.py` | Rows exist in the database |
+| 5.4 | Extend `JobStatus` to 6 states | `app/schemas/enums.py` | Covers all scenarios |
+| 5.5 | Hook `ViduHandler` into the Async Engine | `app/scheduler/handlers/vidu_handler.py` | Lip-sync tasks go through the Engine |
+| 5.6 | Change `SubtitleHandler` to call through the Gateway | `app/scheduler/handlers/subtitle_handler.py` | Subtitle tasks go through the Gateway |
+| 5.7 | Unified callback entry point `/webhooks/{platform}` | `app/api/v1/webhooks.py` | Vidu callbacks work |
+| 5.8 | Remove the Router code that sets its own private Redis keys | `app/api/v1/vidu.py` | No occurrence of `vidu:lipsync:` |
+| 5.9 | Unified task API `/jobs/{job_id}` | `app/api/v1/jobs.py` | GET returns the standard format |
+| 5.10 | Switch script generation from SSE to an async task | `app/api/v1/script.py` / `app/services/script_service.py` | Submit via `POST /jobs`, poll `/jobs/{id}` |
+| 5.11 | Remove the `/script/generate/stream` SSE endpoint | `app/api/v1/script.py` | Endpoint no longer exists |
+
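+**Acceptance criteria**:
+- [ ] After a Vidu lip-sync task is submitted, Redis contains only keys of the form `job:{uuid}`
+- [ ] After an application restart, running tasks are recovered from the database and continue executing
+- [ ] The frontend polls `/jobs/{id}` to get the status of every async task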
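The "database first, then Redis" ordering from task 5.3, together with recovery of running tasks after a restart, can be illustrated with the standard library alone. In this sketch `sqlite3` stands in for the real database, a plain `dict` stands in for Redis, and the table layout and method names are assumptions rather than the project's actual `JobRegistry`.

```python
import json
import sqlite3
import uuid


class JobRegistry:
    """Write the durable row first, then mirror it into the cache."""

    def __init__(self, db: sqlite3.Connection, cache: dict) -> None:
        self._db = db
        self._cache = cache  # stand-in for Redis
        db.execute(
            "CREATE TABLE IF NOT EXISTS jobs "
            "(id TEXT PRIMARY KEY, kind TEXT NOT NULL, status TEXT NOT NULL)"
        )

    def _mirror(self, job_id: str, kind: str, status: str) -> None:
        # Single key format, matching the `job:{uuid}` convention.
        self._cache[f"job:{job_id}"] = json.dumps({"kind": kind, "status": status})

    def create(self, kind: str) -> str:
        job_id = str(uuid.uuid4())
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT INTO jobs (id, kind, status) VALUES (?, ?, ?)",
                (job_id, kind, "running"),
            )
        # The cache is written only after the database commit succeeded.
        self._mirror(job_id, kind, "running")
        return job_id

    def recover_running(self) -> list[str]:
        """Rebuild cache entries for running jobs, e.g. after a restart."""
        rows = self._db.execute(
            "SELECT id, kind, status FROM jobs WHERE status = 'running'"
        ).fetchall()
        for job_id, kind, status in rows:
            self._mirror(job_id, kind, status)
        return [row[0] for row in rows]


db = sqlite3.connect(":memory:")
cache: dict = {}
registry = JobRegistry(db, cache)
job_id = registry.create("lipsync")
cache.clear()  # simulate losing the cache across a restart
recovered = JobRegistry(db, cache).recover_running()
```

Writing the database row before the cache entry means a crash between the two steps leaves a recoverable job rather than a cache entry with no durable record.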
+
+### Phase 6: Cleanup and Verification (0.5 day)
+
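+| # | Task | Output file | How to check |
+|---|------|---------|---------|
+| 6.1 | `importlinter` configuration (forbid Routers from importing Providers directly) | `.importlinter` | Passes in CI |
+| 6.2 | Delete the deprecated `ai_models.yaml` (after confirming the merge is complete) | N/A | File no longer exists |
+| 6.3 | Remove duplicated exception handling in `ViduService` / `VolcengineCaptionService` | Corresponding files | Code review |
+| 6.4 | Full regression test (call every existing API once) | N/A | Test script passes |
+| 6.5 | Update this document and mark the completion status of each phase | This document | All checkboxes ticked |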
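To make the intent of the import constraints concrete (task 6.1, and the `aiohttp` ban from task 3.6), here is a rough standard-library approximation of what such a rule checks. It is not how `importlinter` or the lint rule is implemented; the function name and the sample Router source are hypothetical.

```python
import ast


def forbidden_imports(source: str, forbidden: tuple[str, ...]) -> list[tuple[int, str]]:
    """Return (line, module) pairs for imports matching a forbidden prefix."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Also consider `from pkg import sub`, which imports pkg.sub.
            candidates = [node.module] + [
                f"{node.module}.{alias.name}" for alias in node.names
            ]
        else:
            continue
        for name in candidates:
            if any(name == p or name.startswith(p + ".") for p in forbidden):
                violations.append((node.lineno, name))
                break  # report each import statement at most once
    return violations


# Hypothetical Router module that breaks both rules.
router_source = '''
from fastapi import APIRouter
from app.ai.providers.vidu_provider import ViduProvider
import aiohttp
'''
violations = forbidden_imports(router_source, ("app.ai.providers", "aiohttp"))
```

A check like this only sees static, absolute imports; dynamic imports and relative imports would need separate handling.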
+
+---
+
+## 4. Checklist Summary
+
+### 4.1 New Files
+
+| File path | Description | Phase |
+|---------|------|---------|
+| `app/ai/adapters/__init__.py` | Adapter package | Phase 0 |
+| `app/ai/adapters/base.py` | Protocol + dataclass | Phase 2 |
+| `app/ai/adapters/constants.py` | Method constants | Phase 2 |
+| `app/ai/adapters/volcengine_ark.py` | Volcengine Ark Adapter | Phase 4 |
+| `app/ai/adapters/vidu.py` | Vidu Adapter | Phase 4 |
+| `app/ai/adapters/volcengine_caption.py` | Volcengine caption Adapter | Phase 4 |
+| `app/ai/gateways/llm_gateway.py` | LLM gateway | Phase 4 |
+| `app/platform_gateway.py` | Unified platform gateway | Phase 0/4 |
+| `app/core/runtime_config.py` | Runtime configuration + hot reload | Phase 2 |
+| `config/platform-config.yaml` | Merged platform configuration | Phase 2 |
+| `app/models/job.py` | Async task database model | Phase 5 |
+| `app/api/v1/jobs.py` | Unified task API | Phase 5 |
+| `app/api/v1/webhooks.py` | Unified callback entry point | Phase 5 |
+| `app/scheduler/handlers/vidu_handler.py` | Vidu task handler | Phase 5 |
+| `.importlinter` | Architecture constraint configuration | Phase 6 |
+
+### 4.2 Modified Files
+
+| File path | Change | Phase |
+|---------|---------|---------|
+| `app/core/exceptions.py` | Add `PlatformError` / `PlatformErrorType` | Phase 1 |
+| `app/main.py` | Register the exception middleware; manage Clients in lifespan | Phase 1/3/4 |
+| `app/config.py` | Remove non-sensitive configuration; keep only secrets | Phase 2 |
+| `app/ai/providers/vidu_provider.py` | aiohttp → httpx | Phase 3 |
+| `app/services/volcengine_caption_service.py` | Remove lazy loading; inject the Client instead | Phase 3 |
+| `app/api/v1/voice.py` | Ad hoc Client → shared Client | Phase 3 |
+| `app/api/v1/script.py` | SSE → async task; remove the stream endpoint | Phase 5 |
+| `app/services/script_service.py` | Remove `generate_script_stream` | Phase 5 |
+| `app/api/v1/system.py` | Add the Admin API | Phase 2 |
+| `app/scheduler/registry.py` | Write to the database first, then to Redis | Phase 5 |
+| `app/scheduler/handlers/subtitle_handler.py` | Call through the Gateway | Phase 5 |
+| `app/api/v1/vidu.py` | Remove private Redis keys | Phase 5 |
+| `app/schemas/enums.py` | Extend `JobStatus` | Phase 5 |
+| `Makefile` / `pyproject.toml` | Lint rules | Phase 1/3/6 |
+
+### 4.3 Deprecated Files
+
+| File path | Reason for deprecation | When |
+|---------|---------|---------|
+| `config/ai_models.yaml` | Merged into `platform-config.yaml` | Phase 6 |
+
+---
+
+## 5. Risks and Mitigations
+
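+| Risk | Impact | Likelihood | Mitigation |
+|-----|------|------|------|
+| Migrating from `aiohttp` to `httpx` changes Vidu behavior in some edge cases | Functional regression | Medium | Run the full Vidu test suite (TTS / lip-sync / cloning) after the migration |
+| `PlatformError` does not cover every exception path, so bare `Exception`s still leak out | The frontend receives a 500 it cannot handle | Low | Enforce with `make lint-semantic` plus code review |
+| Configuration hot reload causes abrupt changes in runtime behavior | Production rate limits change unexpectedly | Low | Add audit logging to the Admin API; confirm before applying changes |
+| The Phase 5 database changes affect the existing Async Engine | Subtitle/script tasks fail | Medium | Implement only after the database design has been reviewed; migrate step by step |
+| The frontend polling rework takes more effort than expected | Schedule slip | Medium | Share the API changes with the frontend team early; reserve 2 days |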
+
+---
+
+## 6. Acceptance Criteria (Final Checklist)
+
+Once the entire implementation is complete, verify each item on the following list:
+
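+- [ ] `PlatformError` is the only third-party exception type in `app/services/` and `app/ai/`
+- [ ] No Router handles third-party errors with `except Exception: raise HTTPException(500)`
+- [ ] A new MockAdapter that implements the Protocol can be registered and usable within 30 minutes
+- [ ] `aiohttp` is not a direct project dependency (`pip show aiohttp` shows nothing, or it is only a transitive dependency)
+- [ ] Every `AsyncClient` is created and destroyed in lifespan
+- [ ] Shutting down the application produces no `unclosed client session` warning
+- [ ] `config/platform-config.yaml` exists and contains the configuration for every platform
+- [ ] After a rate-limit parameter in `platform-config.yaml` is changed, new requests pick it up within 10 seconds
+- [ ] A manual reload triggered through the Admin API `/admin/runtime-config/reload` succeeds
+- [ ] The health check `/system/platform-health` returns the status of every platform
+- [ ] The `importlinter` CI check passes (Routers do not import Providers directly)
+- [ ] The full API regression test passes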
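The hot-reload items on this checklist (changes to `platform-config.yaml` taking effect shortly after an edit, plus a manual reload) could be approached as in the sketch below. It is an assumption, not the contents of `app/core/runtime_config.py`: the `ReloadingConfig` class is hypothetical, it detects changes by file modification time, and JSON is used in place of YAML so the example needs only the standard library. The `rate_limit_per_minute` key is likewise invented for illustration.

```python
import json
import os
import tempfile
from typing import Any, Callable


class ReloadingConfig:
    """Re-parse a config file whenever its modification time changes."""

    def __init__(self, path: str, parse: Callable[[str], Any]) -> None:
        self._path = path
        self._parse = parse
        self._mtime_ns = -1  # forces a load on first access
        self._data: Any = None

    def reload(self) -> None:
        """Force a re-read, as a manual reload endpoint might do."""
        mtime_ns = os.stat(self._path).st_mtime_ns
        with open(self._path, encoding="utf-8") as fh:
            self._data = self._parse(fh.read())
        self._mtime_ns = mtime_ns

    def get(self) -> Any:
        if os.stat(self._path).st_mtime_ns != self._mtime_ns:
            self.reload()
        return self._data


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "platform-config.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"vidu": {"rate_limit_per_minute": 60}}, fh)
    config = ReloadingConfig(path, json.loads)
    before = config.get()["vidu"]["rate_limit_per_minute"]

    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"vidu": {"rate_limit_per_minute": 30}}, fh)
    # Push the mtime forward explicitly so the change is detected even on
    # filesystems with coarse timestamp resolution.
    stat = os.stat(path)
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 10_000_000_000))
    after = config.get()["vidu"]["rate_limit_per_minute"]
```

This version checks the file on every `get()`; a production loader would more likely check on a timer, which is where a bound such as the 10-second window would come from.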
+
+---
+
+> This document is the sole basis for implementation. Any deviation from it must be recorded in this document together with the reason.