# Text-to-Image and Image-to-Image Tools: Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add two tools, `text_to_image` and `image_to_image`, to Neta Agent; build an extensible image-generation Provider layer; and render image results in the frontend.

**Architecture:** The backend adds an `image_providers/` directory that implements the Provider strategy pattern (Volcano Engine + MiniMax). `tool_resolver` injects credentials and a provider instance into both tools. Image URLs returned by the tools are copied to local storage by `ImageStorageService` and then written to the session tree. On the frontend, `message-item.vue` upgrades its image rendering and adds a multi-image grid branch.

**Tech Stack:** TypeScript, OpenAI SDK (Volcano Engine), fetch (MiniMax REST), TypeBox (parameter schema), Vue 3 + Element Plus (frontend)

**Spec:** `docs/superpowers/specs/2026-05-02-image-generation-tools-design.md`

---

## File Map

| File | Responsibility | Action |
|------|------|------|
| `packages/backend/src/modules/netaclaw/image_providers/types.ts` | Unified Provider interface, credentials, error types, factory function | Create |
| `packages/backend/src/modules/netaclaw/image_providers/ark.ts` | Volcano Engine Provider | Create |
| `packages/backend/src/modules/netaclaw/image_providers/minimax.ts` | MiniMax Provider | Create |
| `packages/backend/src/modules/netaclaw/tools/common.ts` | Add `ImageItem`, the `images` type, `imagesResult()`, and the `toolResultToText` extension | Modify |
| `packages/backend/src/modules/netaclaw/tools/manifest.ts` | `ToolGovernanceExtra` gains `imageDefaults` / `imageConstraints` | Modify |
| `packages/backend/src/modules/netaclaw/tools/builtin/image_common.ts` | Shared helpers for the image tools (clampDimension, persistImages, formatImageToolResult) | Create |
| `packages/backend/src/modules/netaclaw/tools/builtin/text_to_image.ts` | Text-to-image tool | Create |
| `packages/backend/src/modules/netaclaw/tools/builtin/image_to_image.ts` | Image-to-image tool | Create |
| `packages/backend/src/modules/netaclaw/tools/catalog.ts` | Import the two new tools to trigger registration | Modify |
| `packages/backend/src/modules/netaclaw/service/image_storage.ts` | Service that copies generated images to local storage | Create |
| `packages/backend/src/modules/netaclaw/service/tool_registry.ts` | Add the `getToolByName` method | Modify |
| `packages/backend/src/modules/netaclaw/service/tool_resolver.ts` | Inject credentials and the provider instance | Modify |
| `packages/backend/src/modules/netaclaw/runtime/prompt_builder.ts` | Extend the attachment hint text | Modify |
| `packages/backend/src/modules/netaclaw/runtime/agent.ts` | Pass `toolNames` when calling `buildLLMMessages` | Modify |
| `packages/frontend/src/modules/agent/tools/renderer-registry.ts` | rawResult gains the `images` type | Modify |
| `packages/frontend/src/modules/agent/components/message-item.vue` | Upgrade single-image rendering to el-image; add multi-image grid | Modify |
| `packages/frontend/src/modules/agent/views/tools.vue` | Image generation configuration section | Modify |

---

### Task 1: Backend Type Foundation (ToolResultContent Multi-Image Extension)

**Files:**
- Modify: `packages/backend/src/modules/netaclaw/tools/common.ts`

- [ ] **Step 1: Add the ImageItem interface before the ToolResultContent type**

In `common.ts`, before the `ToolResultContent` type definition (around line 26), add:

```typescript
export interface ImageItem {
  url: string;
  mimeType?: string;
  width?: number;
  height?: number;
  seed?: number;
}
```

- [ ] **Step 2: Extend the ToolResultContent union type**

After the existing `type: 'image'` branch, add an `images` branch:

```typescript
  | { type: 'images'; images: ImageItem[]; text?: string };
```

The complete type becomes:

```typescript
export type ToolResultContent =
  | { type: 'text'; text: string }
  | { type: 'json'; data: unknown }
  | {
      type: 'image';
      url: string;
      mimeType?: string;
      text?: string;
      width?: number;
      height?: number;
      bytes?: number;
      originalWidth?: number;
      originalHeight?: number;
      originalBytes?: number;
      resized?: boolean;
    }
  | { type: 'images'; images: ImageItem[]; text?: string };
```

- [ ] **Step 3: Add the imagesResult helper**

After the `imageResult()` function, add:

```typescript
export function imagesResult(
  images: ImageItem[],
  text?: string,
): ToolResultContent {
  return { type: 'images', images, text };
}
```

- [ ] **Step 4: Extend the toolResultToText function**

In `toolResultToText`, after the `if (value.type === 'image')` branch, add:

```typescript
if (value.type === 'images') {
  // One cast instead of repeated `as any`; drop it if `value` is already narrowed here.
  const { images, text } = value as { type: 'images'; images: ImageItem[]; text?: string };
  // One line per image: index, URL, and the size when both dimensions are known.
  const lines = images.map((img, i) =>
    `[图${i + 1}] ${img.url}${img.width && img.height ? ` (${img.width}x${img.height})` : ''}`
  );
  const header = text || `已生成 ${lines.length} 张图片`;
  return `${header}\n${lines.join('\n')}`;
}
```

- [ ] **Step 5: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no new errors

- [ ] **Step 6: Commit**

```bash
git add packages/backend/src/modules/netaclaw/tools/common.ts
git commit -m "feat(netaclaw): extend ToolResultContent with images type for multi-image tool results"
```
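To illustrate what the `images` branch from Step 4 hands back to the model, here is a minimal standalone sketch. `imagesToText` and `ImagesContent` are hypothetical names introduced only for this example (the real logic lives inside `toolResultToText`), and the URLs are placeholders.

```typescript
// Trimmed copy of ImageItem, enough for this illustration.
interface ImageItem {
  url: string;
  width?: number;
  height?: number;
}

// Hypothetical alias for the `images` branch of ToolResultContent.
type ImagesContent = { type: 'images'; images: ImageItem[]; text?: string };

// Hypothetical standalone version of the `images` branch in toolResultToText.
function imagesToText(value: ImagesContent): string {
  const lines = value.images.map((img, i) => {
    const size = img.width && img.height ? ` (${img.width}x${img.height})` : '';
    return `[图${i + 1}] ${img.url}${size}`;
  });
  // Fall back to a generated header when the tool supplied no text.
  const header = value.text || `已生成 ${lines.length} 张图片`;
  return `${header}\n${lines.join('\n')}`;
}

const sample: ImagesContent = {
  type: 'images',
  images: [
    { url: 'https://example.com/a.png', width: 1024, height: 1024 },
    { url: 'https://example.com/b.png' }, // no size known, so no suffix
  ],
};

console.log(imagesToText(sample));
```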

---

### Task 2: ToolGovernanceExtra Extension (imageDefaults / imageConstraints)

**Files:**
- Modify: `packages/backend/src/modules/netaclaw/tools/manifest.ts`

- [ ] **Step 1: Extend the ToolGovernanceExtra type**

In `manifest.ts`, add the image-tool fields to the `ToolGovernanceExtra` type at lines 9-12:

```typescript
export type ToolGovernanceExtra = {
  allowInSubagent?: boolean;
  workerRoutingStrategy?: ToolWorkerRoutingStrategy;
  imageDefaults?: {
    n?: number;
    aspectRatio?: string;
    width?: number;
    height?: number;
    watermark?: boolean;
    responseFormat?: 'url' | 'base64';
  };
  imageConstraints?: {
    maxN?: number;
    maxWidth?: number;
    maxHeight?: number;
  };
};
```

- [ ] **Step 2: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no new errors (existing code only reads `allowInSubagent` and `workerRoutingStrategy`, so the new fields do not affect it)

- [ ] **Step 3: Commit**

```bash
git add packages/backend/src/modules/netaclaw/tools/manifest.ts
git commit -m "feat(netaclaw): add imageDefaults and imageConstraints to ToolGovernanceExtra"
```

---

### Task 3: Provider Layer (Unified Interface and Factory)

**Files:**
- Create: `packages/backend/src/modules/netaclaw/image_providers/types.ts`

- [ ] **Step 1: Create the image_providers directory and types.ts**

```typescript
import type { ToolGovernanceExtra } from '../tools/manifest.js';

export interface ImageProviderCredentials {
  baseUrl: string;
  apiKey: string;
  supplier: string;
  modelId: string;
  promptHint: string | null;
  extra?: ToolGovernanceExtra | null;
}

export interface TextToImageParams {
  prompt: string;
  width?: number;
  height?: number;
  aspectRatio?: string;
  n?: number;
  responseFormat?: 'url' | 'base64';
  watermark?: boolean;
  seed?: number;
  extra?: Record<string, unknown>;
}

export interface ImageToImageParams extends TextToImageParams {
  referenceImage: string;
  strength?: number;
}

export interface ImageGenerationResult {
  images: { url?: string; base64?: string; width?: number; height?: number }[];
  model: string;
  provider: string;
}

export interface ImageGenerationProvider {
  readonly id: string;
  textToImage(params: TextToImageParams, creds: ImageProviderCredentials): Promise<ImageGenerationResult>;
  imageToImage(params: ImageToImageParams, creds: ImageProviderCredentials): Promise<ImageGenerationResult>;
}

export type ImageGenerationErrorCode =
  | 'content_safety'
  | 'rate_limit'
  | 'insufficient_balance'
  | 'invalid_params'
  | 'timeout'
  | 'network'
  | 'unknown';

export class ImageGenerationError extends Error {
  constructor(
    message: string,
    public readonly code: ImageGenerationErrorCode,
    public readonly retryable: boolean,
  ) {
    super(message);
    this.name = 'ImageGenerationError';
  }
}

// Module-level registry; each provider module registers itself on import.
const providers = new Map<string, ImageGenerationProvider>();

export function registerImageProvider(provider: ImageGenerationProvider): void {
  providers.set(provider.id, provider);
}

// Resolve a provider from the channel supplier, falling back to the base URL
// for OpenAI-compatible channels. Returns null when nothing matches.
export function getImageProvider(supplier: string, baseUrl: string): ImageGenerationProvider | null {
  const s = supplier.toLowerCase();
  if (s === 'minimax') return providers.get('minimax') ?? null;
  if (s === 'ark' || s === 'volcengine') return providers.get('ark') ?? null;
  if (s === 'openai') {
    if (baseUrl.includes('volces.com') || baseUrl.includes('volcengine')) return providers.get('ark') ?? null;
    if (baseUrl.includes('minimax')) return providers.get('minimax') ?? null;
  }
  return null;
}
```

- [ ] **Step 2: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 3: Commit**

```bash
git add packages/backend/src/modules/netaclaw/image_providers/
git commit -m "feat(netaclaw): add image provider types, error class, and factory"
```
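The lookup rules in `getImageProvider` can be easier to reason about when reduced to a pure function. The sketch below is a hedged illustration, not part of the plan's files: `resolveProviderId` is a hypothetical name, it returns only the provider id instead of a registered instance, and the base URLs are sample values. It shows that an OpenAI-compatible channel is routed by its base URL, and that an unrecognized channel resolves to `null` (which Task 10 reports as `image_provider_not_found`).

```typescript
type ProviderId = 'ark' | 'minimax';

// Hypothetical pure version of the matching rules in getImageProvider.
function resolveProviderId(supplier: string, baseUrl: string): ProviderId | null {
  const s = supplier.toLowerCase();
  if (s === 'minimax') return 'minimax';
  if (s === 'ark' || s === 'volcengine') return 'ark';
  if (s === 'openai') {
    // OpenAI-compatible channels are disambiguated by their base URL.
    if (baseUrl.includes('volces.com') || baseUrl.includes('volcengine')) return 'ark';
    if (baseUrl.includes('minimax')) return 'minimax';
  }
  return null;
}

// Sample inputs; the URLs are illustrative only.
console.log(resolveProviderId('OpenAI', 'https://ark.cn-beijing.volces.com/api/v3'));
console.log(resolveProviderId('MiniMax', ''));
console.log(resolveProviderId('openai', 'https://api.openai.com/v1'));
```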

---

### Task 4: Provider Layer (Volcano Engine / Ark)

**Files:**
- Create: `packages/backend/src/modules/netaclaw/image_providers/ark.ts`

- [ ] **Step 1: Implement ArkImageProvider**

```typescript
import OpenAI from 'openai';
import {
  type ImageGenerationProvider,
  type ImageProviderCredentials,
  type TextToImageParams,
  type ImageToImageParams,
  type ImageGenerationResult,
  ImageGenerationError,
  registerImageProvider,
} from './types.js';

// Map an aspect ratio (preferred) or explicit width/height to a `size` string.
function resolveSize(params: TextToImageParams): string | undefined {
  if (params.aspectRatio) {
    const map: Record<string, string> = {
      '1:1': '1024x1024',
      '16:9': '1280x720',
      '4:3': '1152x864',
      '3:2': '1248x832',
      '2:3': '832x1248',
      '3:4': '864x1152',
      '9:16': '720x1280',
    };
    return map[params.aspectRatio] ?? '1024x1024';
  }
  if (params.width && params.height) {
    return `${params.width}x${params.height}`;
  }
  return undefined;
}

// The OpenAI-compatible wire value is 'b64_json', not 'base64'.
function toWireFormat(format: TextToImageParams['responseFormat']): 'url' | 'b64_json' {
  return format === 'base64' ? 'b64_json' : 'url';
}

function normalizeResult(response: OpenAI.Images.ImagesResponse, creds: ImageProviderCredentials): ImageGenerationResult {
  return {
    images: (response.data ?? []).map(item => ({
      url: item.url,
      base64: item.b64_json,
    })),
    model: creds.modelId,
    provider: 'ark',
  };
}

class ArkImageProvider implements ImageGenerationProvider {
  readonly id = 'ark';

  async textToImage(params: TextToImageParams, creds: ImageProviderCredentials): Promise<ImageGenerationResult> {
    const client = new OpenAI({ apiKey: creds.apiKey, baseURL: creds.baseUrl, timeout: 60_000 });
    try {
      // The Node SDK has no `extra_body` option (that belongs to the Python SDK).
      // Undocumented keys placed in the request object are sent as-is, hence `as any`.
      const response = await client.images.generate({
        // Provider-specific extras go first so the explicit fields below always win.
        ...params.extra,
        model: creds.modelId,
        prompt: params.prompt,
        size: resolveSize(params),
        n: params.n ?? 1,
        response_format: toWireFormat(params.responseFormat),
        ...(params.watermark !== undefined ? { watermark: params.watermark } : {}),
      } as any);
      return normalizeResult(response, creds);
    } catch (err: any) {
      throw this.wrapError(err);
    }
  }

  async imageToImage(params: ImageToImageParams, creds: ImageProviderCredentials): Promise<ImageGenerationResult> {
    const client = new OpenAI({ apiKey: creds.apiKey, baseURL: creds.baseUrl, timeout: 60_000 });
    try {
      const response = await client.images.generate({
        ...params.extra,
        model: creds.modelId,
        prompt: params.prompt,
        size: resolveSize(params),
        n: params.n ?? 1,
        response_format: toWireFormat(params.responseFormat),
        image: params.referenceImage,
        ...(params.strength !== undefined ? { strength: params.strength } : {}),
        ...(params.watermark !== undefined ? { watermark: params.watermark } : {}),
      } as any);
      return normalizeResult(response, creds);
    } catch (err: any) {
      throw this.wrapError(err);
    }
  }

  // Translate SDK / HTTP failures into the shared ImageGenerationError codes.
  private wrapError(err: any): ImageGenerationError {
    const status = err?.status ?? err?.response?.status;
    const msg = err?.message ?? String(err);
    if (status === 429) return new ImageGenerationError(msg, 'rate_limit', true);
    if (status === 402) return new ImageGenerationError(msg, 'insufficient_balance', false);
    if (status === 400) {
      if (msg.includes('safety') || msg.includes('sensitive') || msg.includes('安全'))
        return new ImageGenerationError('提示词触发内容安全策略,请调整描述', 'content_safety', false);
      return new ImageGenerationError(msg, 'invalid_params', false);
    }
    // SDK timeouts surface as APIConnectionTimeoutError; check it before its parent class.
    if (err instanceof OpenAI.APIConnectionTimeoutError || err?.code === 'ETIMEDOUT' || err?.code === 'ECONNABORTED')
      return new ImageGenerationError('生成超时,可尝试降低图片尺寸', 'timeout', true);
    if (err instanceof OpenAI.APIConnectionError)
      return new ImageGenerationError(msg, 'network', true);
    return new ImageGenerationError(msg, 'unknown', false);
  }
}

registerImageProvider(new ArkImageProvider());
```

- [ ] **Step 2: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 3: Commit**

```bash
git add packages/backend/src/modules/netaclaw/image_providers/ark.ts
git commit -m "feat(netaclaw): add Ark (Volcano Engine) image provider via OpenAI SDK"
```
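The status-code branch of `wrapError` is the part most worth unit testing, because it decides whether the agent is told a failure is retryable. The following is a minimal sketch under stated assumptions: `classifyHttpError` and `Classified` are hypothetical names, and the function covers only the HTTP-status rules (not the timeout or network cases), returning a plain object instead of an `ImageGenerationError`.

```typescript
type ErrorCode =
  | 'content_safety'
  | 'rate_limit'
  | 'insufficient_balance'
  | 'invalid_params'
  | 'unknown';

interface Classified {
  code: ErrorCode;
  retryable: boolean;
}

// Hypothetical pure version of the HTTP-status rules in ArkImageProvider.wrapError.
function classifyHttpError(status: number | undefined, message: string): Classified {
  if (status === 429) return { code: 'rate_limit', retryable: true };
  if (status === 402) return { code: 'insufficient_balance', retryable: false };
  if (status === 400) {
    // A 400 is treated as a content-safety rejection only when the message says so.
    const flagged = ['safety', 'sensitive', '安全'].some(k => message.includes(k));
    return { code: flagged ? 'content_safety' : 'invalid_params', retryable: false };
  }
  return { code: 'unknown', retryable: false };
}

console.log(classifyHttpError(429, 'Too Many Requests'));
console.log(classifyHttpError(400, 'prompt contains sensitive content'));
console.log(classifyHttpError(400, 'size is not supported'));
```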

---

### Task 5: Provider Layer (MiniMax)

**Files:**
- Create: `packages/backend/src/modules/netaclaw/image_providers/minimax.ts`

- [ ] **Step 1: Implement MiniMaxImageProvider**

```typescript
import {
  type ImageGenerationProvider,
  type ImageProviderCredentials,
  type TextToImageParams,
  type ImageToImageParams,
  type ImageGenerationResult,
  ImageGenerationError,
  registerImageProvider,
} from './types.js';

interface MiniMaxResponse {
  id?: string;
  data?: { image_urls?: string[]; image_base64?: string[] };
  metadata?: { success_count?: number; failed_count?: number };
  base_resp?: { status_code?: number; status_msg?: string };
}

// Build the request fields shared by text-to-image and image-to-image.
function buildBaseBody(params: TextToImageParams, creds: ImageProviderCredentials): Record<string, unknown> {
  const body: Record<string, unknown> = {
    model: creds.modelId,
    prompt: params.prompt,
    n: params.n ?? 1,
    response_format: params.responseFormat ?? 'url',
  };
  if (params.watermark !== undefined) body.aigc_watermark = params.watermark;
  // aspectRatio takes precedence over explicit width/height.
  if (params.aspectRatio) {
    body.aspect_ratio = params.aspectRatio;
  } else if (params.width && params.height) {
    body.width = params.width;
    body.height = params.height;
  }
  if (params.seed !== undefined) body.seed = params.seed;
  if (params.extra?.style) body.style = params.extra.style;
  if (params.extra?.prompt_optimizer !== undefined) body.prompt_optimizer = params.extra.prompt_optimizer;
  return body;
}

// Map MiniMax business status codes to shared errors, then normalize the image list.
function normalizeResponse(json: MiniMaxResponse, creds: ImageProviderCredentials, format: string): ImageGenerationResult {
  const resp = json.base_resp;
  if (resp && resp.status_code !== 0) {
    const code = resp.status_code;
    if (code === 1026) throw new ImageGenerationError('提示词触发内容安全策略,请调整描述', 'content_safety', false);
    if (code === 1002) throw new ImageGenerationError('当前请求过多,请稍后重试', 'rate_limit', true);
    if (code === 1008) throw new ImageGenerationError('模型渠道余额不足', 'insufficient_balance', false);
    if (code === 1004) throw new ImageGenerationError('API Key 鉴权失败', 'invalid_params', false);
    throw new ImageGenerationError(resp.status_msg ?? `MiniMax error ${code}`, 'unknown', false);
  }
  const urls = json.data?.image_urls ?? [];
  const b64s = json.data?.image_base64 ?? [];
  const images = format === 'base64'
    ? b64s.map(b => ({ base64: b }))
    : urls.map(u => ({ url: u }));
  return { images, model: creds.modelId, provider: 'minimax' };
}

class MiniMaxImageProvider implements ImageGenerationProvider {
  readonly id = 'minimax';

  async textToImage(params: TextToImageParams, creds: ImageProviderCredentials): Promise<ImageGenerationResult> {
    const body = buildBaseBody(params, creds);
    return this.request(body, creds, params.responseFormat ?? 'url');
  }

  async imageToImage(params: ImageToImageParams, creds: ImageProviderCredentials): Promise<ImageGenerationResult> {
    const body = buildBaseBody(params, creds);
    // NOTE: confirm the `subject_reference` item shape against the MiniMax
    // image-generation API reference before shipping.
    body.subject_reference = [{ image: params.referenceImage }];
    if (params.strength !== undefined) body.strength = params.strength;
    return this.request(body, creds, params.responseFormat ?? 'url');
  }

  private async request(body: Record<string, unknown>, creds: ImageProviderCredentials, format: string): Promise<ImageGenerationResult> {
    // Abort the request after 60 seconds.
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), 60_000);
    try {
      const baseUrl = creds.baseUrl.replace(/\/+$/, '');
      const res = await fetch(`${baseUrl}/v1/image_generation`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Authorization': `Bearer ${creds.apiKey}`,
        },
        body: JSON.stringify(body),
        signal: controller.signal,
      });
      if (!res.ok) {
        const text = await res.text().catch(() => '');
        throw new ImageGenerationError(`MiniMax HTTP ${res.status}: ${text}`, res.status === 429 ? 'rate_limit' : 'unknown', res.status === 429);
      }
      // `res.json()` may be typed as `unknown`, so cast explicitly.
      const json = (await res.json()) as MiniMaxResponse;
      return normalizeResponse(json, creds, format);
    } catch (err: any) {
      if (err instanceof ImageGenerationError) throw err;
      if (err?.name === 'AbortError') throw new ImageGenerationError('生成超时,可尝试降低图片尺寸', 'timeout', true);
      throw new ImageGenerationError(err?.message ?? String(err), 'network', true);
    } finally {
      clearTimeout(timer);
    }
  }
}

registerImageProvider(new MiniMaxImageProvider());
```

- [ ] **Step 2: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 3: Commit**

```bash
git add packages/backend/src/modules/netaclaw/image_providers/minimax.ts
git commit -m "feat(netaclaw): add MiniMax image provider via REST API"
```
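One detail of `buildBaseBody` that is easy to miss is the size precedence: an aspect ratio wins over explicit dimensions, and a lone width or height is dropped. The sketch below isolates just that rule so it can be checked without a network call. `buildSizeFields` and `SizeParams` are hypothetical names used only for this illustration.

```typescript
interface SizeParams {
  aspectRatio?: string;
  width?: number;
  height?: number;
}

// Hypothetical extraction of the size rule from buildBaseBody.
function buildSizeFields(params: SizeParams): Record<string, unknown> {
  // aspectRatio takes precedence; width/height are then ignored.
  if (params.aspectRatio) return { aspect_ratio: params.aspectRatio };
  // Explicit dimensions are only sent when both are present.
  if (params.width && params.height) return { width: params.width, height: params.height };
  return {};
}

console.log(buildSizeFields({ aspectRatio: '16:9', width: 512, height: 512 }));
console.log(buildSizeFields({ width: 768, height: 1024 }));
console.log(buildSizeFields({ width: 768 }));
```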

---

### Task 6: Image Persistence Service

**Files:**
- Create: `packages/backend/src/modules/netaclaw/service/image_storage.ts`

- [ ] **Step 1: Implement ImageStorageService**

Reuse the `downAndUpload` method of the existing `pluginService.getInstance('upload')`:

```typescript
import { Inject, Provide, Scope, ScopeEnum } from '@midwayjs/core';
import { PluginService } from '../../plugin/service/info.js';
import { randomUUID } from 'crypto';

@Provide()
@Scope(ScopeEnum.Singleton)
export class ImageStorageService {
  @Inject()
  pluginService: PluginService;

  // Download a temporary provider URL and re-upload it; returns the permanent URL.
  async persist(tempUrl: string): Promise<string> {
    const upload = await this.pluginService.getInstance('upload');
    const ext = this.detectExtension(tempUrl);
    const filename = `img-${Date.now()}-${randomUUID().slice(0, 8)}${ext}`;
    return upload.downAndUpload(tempUrl, filename);
  }

  async persistAll(urls: string[]): Promise<string[]> {
    return Promise.all(urls.map(url => this.persist(url)));
  }

  // Infer the file extension from the URL path (query string ignored); default to .png.
  private detectExtension(url: string): string {
    const pathname = url.split('?')[0];
    const match = pathname.match(/\.(png|jpg|jpeg|webp|gif)$/i);
    return match ? `.${match[1].toLowerCase()}` : '.png';
  }
}
```

- [ ] **Step 2: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 3: Commit**

```bash
git add packages/backend/src/modules/netaclaw/service/image_storage.ts
git commit -m "feat(netaclaw): add ImageStorageService for persisting generated images"
```
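Provider URLs are usually signed, so the extension has to be read from the path rather than the end of the whole string. As a small illustration of the rule in `detectExtension`, here is the same logic as a free function (the URLs are made-up samples). Note that a URL without a recognizable image suffix falls back to `.png`, which is an assumption about the content rather than a detected fact.

```typescript
// Standalone copy of the rule used by ImageStorageService.detectExtension.
function detectExtension(url: string): string {
  // Drop the query string so signed URLs such as `...cat.jpg?sig=abc` still match.
  const pathname = url.split('?')[0];
  const match = pathname.match(/\.(png|jpg|jpeg|webp|gif)$/i);
  return match ? `.${match[1].toLowerCase()}` : '.png';
}

console.log(detectExtension('https://cdn.example.com/out/cat.JPG?sig=abc'));
console.log(detectExtension('https://cdn.example.com/out/frame.webp'));
console.log(detectExtension('https://cdn.example.com/out/12345'));
```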

---

### Task 7: Shared Image Tool Helpers

**Files:**
- Create: `packages/backend/src/modules/netaclaw/tools/builtin/image_common.ts`

- [ ] **Step 1: Create image_common.ts**

Extract the helper functions shared by text_to_image and image_to_image:

```typescript
import { imageResult, imagesResult } from '../common.js';
import type { ImageGenerationResult } from '../../image_providers/types.js';
import type { ImageStorageService } from '../../service/image_storage.js';

// Cap a requested dimension at `max`; undefined passes through unchanged.
export function clampDimension(value: number | undefined, max: number): number | undefined {
  if (value === undefined) return undefined;
  return Math.min(value, max);
}

// Replace each temporary provider URL with a permanent one; base64-only items are left as-is.
export async function persistImages(result: ImageGenerationResult, storage: ImageStorageService): Promise<ImageGenerationResult> {
  const persisted = await Promise.all(
    result.images.map(async img => {
      if (!img.url) return img;
      const permanentUrl = await storage.persist(img.url);
      return { ...img, url: permanentUrl };
    })
  );
  return { ...result, images: persisted };
}

export function formatImageToolResult(result: ImageGenerationResult) {
  // Base64-only results carry no URL, so only URL-backed images can be returned.
  const images = result.images.filter(img => img.url);
  if (images.length === 0) {
    return { type: 'text' as const, text: '图片生成失败: 未返回可用的图片 URL' };
  }
  if (images.length === 1) {
    const img = images[0];
    return imageResult(img.url!, undefined, {
      width: img.width,
      height: img.height,
      text: `图片已生成 (${result.provider}/${result.model})`,
    });
  }
  return imagesResult(
    images.map(img => ({ url: img.url!, width: img.width, height: img.height })),
    `已生成 ${images.length} 张图片 (${result.provider}/${result.model})`,
  );
}
```

- [ ] **Step 2: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 3: Commit**

```bash
git add packages/backend/src/modules/netaclaw/tools/builtin/image_common.ts
git commit -m "feat(netaclaw): extract shared image tool helpers"
```

---

### Task 8: text_to_image Tool

**Files:**
- Create: `packages/backend/src/modules/netaclaw/tools/builtin/text_to_image.ts`
- Modify: `packages/backend/src/modules/netaclaw/tools/catalog.ts`

- [ ] **Step 1: Create text_to_image.ts**

```typescript
import { Type, type Static } from '@sinclair/typebox';
import { type AnyAgentTool } from '../common.js';
import { registerSchema } from '../catalog.js';
import type {
  ImageGenerationProvider,
  ImageProviderCredentials,
} from '../../image_providers/types.js';
import { ImageGenerationError } from '../../image_providers/types.js';
import type { ImageStorageService } from '../../service/image_storage.js';
import { clampDimension, persistImages, formatImageToolResult } from './image_common.js';

const Params = Type.Object({
  prompt: Type.String({ description: '图片描述,尽量详细具体' }),
  aspectRatio: Type.Optional(Type.String({
    description: '宽高比。可选: 1:1, 16:9, 4:3, 3:2, 2:3, 3:4, 9:16',
  })),
  width: Type.Optional(Type.Integer({ description: '精确宽度(像素),优先级低于 aspectRatio' })),
  height: Type.Optional(Type.Integer({ description: '精确高度(像素)' })),
  n: Type.Optional(Type.Integer({ description: '生成数量,默认 1,最大 9', minimum: 1, maximum: 9 })),
  watermark: Type.Optional(Type.Boolean({ description: '是否添加水印' })),
  seed: Type.Optional(Type.Integer({ description: '随机种子,相同 seed 可复现相近结果' })),
  extra: Type.Optional(Type.Record(Type.String(), Type.Unknown(), {
    description: 'Provider 特有参数,如 MiniMax 的 style、prompt_optimizer',
  })),
});

export function createTextToImageTool(
  creds: ImageProviderCredentials,
  provider: ImageGenerationProvider,
  storage: ImageStorageService,
): AnyAgentTool {
  const defaults = creds.extra?.imageDefaults ?? {};
  const constraints = creds.extra?.imageConstraints ?? {};

  return {
    name: 'text_to_image',
    label: '文生图',
    description: creds.promptHint
      ? `根据文字描述生成图片。\n${creds.promptHint}`
      : '根据文字描述生成图片,支持指定尺寸、数量、风格等参数。',
    parameters: Params,
    async execute(_id: string, params: Static<typeof Params>) {
      // Precedence: call params, then configured defaults, then built-in fallbacks;
      // n / width / height are finally capped by the configured constraints.
      const merged = {
        prompt: params.prompt,
        n: Math.min(params.n ?? defaults.n ?? 1, constraints.maxN ?? 9),
        aspectRatio: params.aspectRatio ?? defaults.aspectRatio,
        width: clampDimension(params.width ?? defaults.width, constraints.maxWidth ?? 2048),
        height: clampDimension(params.height ?? defaults.height, constraints.maxHeight ?? 2048),
        watermark: params.watermark ?? defaults.watermark ?? false,
        seed: params.seed,
        responseFormat: (defaults.responseFormat ?? 'url') as 'url' | 'base64',
        extra: params.extra,
      };

      try {
        let result = await provider.textToImage(merged, creds);
        result = await persistImages(result, storage);
        return formatImageToolResult(result);
      } catch (err) {
        // Failures are returned as text so the agent can read them and decide whether to retry.
        if (err instanceof ImageGenerationError) {
          const prefix = err.retryable ? '[可重试] ' : '';
          return { type: 'text' as const, text: `${prefix}图片生成失败: ${err.message}` };
        }
        return { type: 'text' as const, text: `图片生成失败: ${err instanceof Error ? err.message : String(err)}` };
      }
    },
  };
}

registerSchema({
  name: 'text_to_image',
  toolset: 'vision',
  description: '根据文字描述生成图片,支持指定尺寸、数量、风格等参数。',
  capability: 'multimodal',
  visibility: 'tool',
  isCore: false,
  canDisable: true,
  supportsPromptHint: true,
  requiresModel: true,
});
```

- [ ] **Step 2: Register in catalog.ts**

In the import list at the end of `catalog.ts` (after `import './builtin/execute_skill.js';`, around line 66), add:

```typescript
import './builtin/text_to_image.js';
```

- [ ] **Step 3: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 4: Commit**

```bash
git add packages/backend/src/modules/netaclaw/tools/builtin/text_to_image.ts packages/backend/src/modules/netaclaw/tools/catalog.ts
git commit -m "feat(netaclaw): add text_to_image tool with defaults/constraints merge and image persistence"
```
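The merge inside `execute` applies a fixed precedence: call parameters, then `imageDefaults`, then built-in fallbacks, with `imageConstraints` capping the result. A minimal sketch of that rule as a pure function may help when writing tests for it. `mergeImageParams`, `clamp`, and the three interfaces are hypothetical names for this illustration; only the numeric and watermark fields are modeled.

```typescript
interface ImageDefaults { n?: number; width?: number; height?: number; watermark?: boolean }
interface ImageConstraints { maxN?: number; maxWidth?: number; maxHeight?: number }
interface CallParams { n?: number; width?: number; height?: number; watermark?: boolean }

// Same behavior as clampDimension in image_common.ts.
function clamp(value: number | undefined, max: number): number | undefined {
  return value === undefined ? undefined : Math.min(value, max);
}

// Hypothetical pure version of the merge performed in the tool's execute().
function mergeImageParams(
  params: CallParams,
  defaults: ImageDefaults = {},
  constraints: ImageConstraints = {},
) {
  return {
    n: Math.min(params.n ?? defaults.n ?? 1, constraints.maxN ?? 9),
    width: clamp(params.width ?? defaults.width, constraints.maxWidth ?? 2048),
    height: clamp(params.height ?? defaults.height, constraints.maxHeight ?? 2048),
    watermark: params.watermark ?? defaults.watermark ?? false,
  };
}

// The call asks for 6 images at width 4096; the config caps n at 4 and supplies
// a default height and watermark. Width falls back to the built-in 2048 cap.
const merged = mergeImageParams(
  { n: 6, width: 4096 },
  { height: 768, watermark: true },
  { maxN: 4 },
);
console.log(merged);
```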

---

### Task 9: image_to_image Tool

**Files:**
- Create: `packages/backend/src/modules/netaclaw/tools/builtin/image_to_image.ts`
- Modify: `packages/backend/src/modules/netaclaw/tools/catalog.ts`

- [ ] **Step 1: Create image_to_image.ts**

```typescript
import { Type, type Static } from '@sinclair/typebox';
import { type AnyAgentTool } from '../common.js';
import { registerSchema } from '../catalog.js';
import type {
  ImageGenerationProvider,
  ImageProviderCredentials,
} from '../../image_providers/types.js';
import { ImageGenerationError } from '../../image_providers/types.js';
import type { ImageStorageService } from '../../service/image_storage.js';
import { clampDimension, persistImages, formatImageToolResult } from './image_common.js';

const Params = Type.Object({
  prompt: Type.String({ description: '对参考图的修改描述' }),
  referenceImage: Type.String({ description: '参考图片 URL(从用户上传附件获取)' }),
  strength: Type.Optional(Type.Number({
    description: '参考图影响强度 0-1,越大越接近原图', minimum: 0, maximum: 1,
  })),
  aspectRatio: Type.Optional(Type.String({
    description: '宽高比。可选: 1:1, 16:9, 4:3, 3:2, 2:3, 3:4, 9:16',
  })),
  width: Type.Optional(Type.Integer({ description: '精确宽度(像素),优先级低于 aspectRatio' })),
  height: Type.Optional(Type.Integer({ description: '精确高度(像素)' })),
  n: Type.Optional(Type.Integer({ description: '生成数量,默认 1,最大 9', minimum: 1, maximum: 9 })),
  watermark: Type.Optional(Type.Boolean({ description: '是否添加水印' })),
  seed: Type.Optional(Type.Integer({ description: '随机种子' })),
  extra: Type.Optional(Type.Record(Type.String(), Type.Unknown(), {
    description: 'Provider 特有参数',
  })),
});

export function createImageToImageTool(
  creds: ImageProviderCredentials,
  provider: ImageGenerationProvider,
  storage: ImageStorageService,
): AnyAgentTool {
  const defaults = creds.extra?.imageDefaults ?? {};
  const constraints = creds.extra?.imageConstraints ?? {};

  return {
    name: 'image_to_image',
    label: '图生图',
    description: creds.promptHint
      ? `基于参考图片生成新图片。\n${creds.promptHint}`
      : '基于参考图片生成新图片,支持风格迁移、内容编辑等。传入参考图URL和修改描述。',
    parameters: Params,
    async execute(_id: string, params: Static<typeof Params>) {
      // Same precedence as text_to_image: call params, then defaults, then fallbacks, capped by constraints.
      const merged = {
        prompt: params.prompt,
        referenceImage: params.referenceImage,
        strength: params.strength,
        n: Math.min(params.n ?? defaults.n ?? 1, constraints.maxN ?? 9),
        aspectRatio: params.aspectRatio ?? defaults.aspectRatio,
        width: clampDimension(params.width ?? defaults.width, constraints.maxWidth ?? 2048),
        height: clampDimension(params.height ?? defaults.height, constraints.maxHeight ?? 2048),
        watermark: params.watermark ?? defaults.watermark ?? false,
        seed: params.seed,
        responseFormat: (defaults.responseFormat ?? 'url') as 'url' | 'base64',
        extra: params.extra,
      };

      try {
        let result = await provider.imageToImage(merged, creds);
        result = await persistImages(result, storage);
        return formatImageToolResult(result);
      } catch (err) {
        if (err instanceof ImageGenerationError) {
          const prefix = err.retryable ? '[可重试] ' : '';
          return { type: 'text' as const, text: `${prefix}图片生成失败: ${err.message}` };
        }
        return { type: 'text' as const, text: `图片生成失败: ${err instanceof Error ? err.message : String(err)}` };
      }
    },
  };
}

registerSchema({
  name: 'image_to_image',
  toolset: 'vision',
  description: '基于参考图片生成新图片,支持风格迁移、内容编辑等。传入参考图URL和修改描述。',
  capability: 'multimodal',
  visibility: 'tool',
  isCore: false,
  canDisable: true,
  supportsPromptHint: true,
  requiresModel: true,
});
```

- [ ] **Step 2: Register in catalog.ts**

At the end of `catalog.ts`, after `import './builtin/text_to_image.js';`, add:

```typescript
import './builtin/image_to_image.js';
```

- [ ] **Step 3: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 4: Commit**

```bash
git add packages/backend/src/modules/netaclaw/tools/builtin/image_to_image.ts packages/backend/src/modules/netaclaw/tools/catalog.ts
git commit -m "feat(netaclaw): add image_to_image tool with reference image support"
```
|
||
|
||
---
|
||
|
||
### Task 10: tool_resolver integration

**Files:**
- Modify: `packages/backend/src/modules/netaclaw/service/tool_resolver.ts`
- Modify: `packages/backend/src/modules/netaclaw/service/tool_registry.ts`

- [ ] **Step 1: Add a getToolByName method to tool_registry.ts**

In the `NetaClawToolRegistryService` class (near the `getToolModelConfig` method), add:

```typescript
async getToolByName(name: string): Promise<NetaClawToolEntity | null> {
  return this.toolRepo.findOneBy({ name });
}
```

- [ ] **Step 2: Add imports to tool_resolver.ts**

In the import section at the top of `tool_resolver.ts` (around line 32, near `import { createImageRecognizeTool }`), add:

```typescript
import { createTextToImageTool } from '../tools/builtin/text_to_image.js';
import { createImageToImageTool } from '../tools/builtin/image_to_image.js';
import { getImageProvider, type ImageProviderCredentials } from '../image_providers/types.js';
import '../image_providers/ark.js';
import '../image_providers/minimax.js';
import { ImageStorageService } from './image_storage.js';
```

- [ ] **Step 3: Inject ImageStorageService**

In the `NetaClawToolResolverService` class, add the following to the existing `@Inject()` section:

```typescript
@Inject()
imageStorageService: ImageStorageService;
```

- [ ] **Step 4: Inject both tools in the resolve method**

In the `resolve` method of `tool_resolver.ts`, find the injection block for `image_recognize` (around lines 647-664) and add the following after its closing `}`:

```typescript
for (const imgToolName of ['text_to_image', 'image_to_image'] as const) {
  if (!filteredNames.includes(imgToolName)) continue;

  // 1. The tool must have a model configured.
  const toolModelConfig = await this.toolRegistry.getToolModelConfig(imgToolName);
  if (!toolModelConfig) {
    disabledReasons.push({ name: imgToolName, reason: 'model_not_configured' });
    continue;
  }

  // 2. The configured model channel must resolve to usable credentials.
  const channelCreds = await this.modelChannelService.resolveForAgent(toolModelConfig.modelChannelId, toolModelConfig.modelId);
  if (!channelCreds) {
    disabledReasons.push({ name: imgToolName, reason: 'model_channel_unavailable' });
    continue;
  }

  // 3. A provider must exist for the channel's supplier / base URL.
  const provider = getImageProvider(channelCreds.channelSupplier, channelCreds.baseUrl ?? '');
  if (!provider) {
    disabledReasons.push({ name: imgToolName, reason: 'image_provider_not_found' });
    continue;
  }

  const toolEntity = await this.toolRegistry.getToolByName(imgToolName);
  const extra = toolEntity?.extra as import('../tools/manifest.js').ToolGovernanceExtra | null;
  const creds: ImageProviderCredentials = {
    baseUrl: channelCreds.baseUrl ?? '',
    apiKey: channelCreds.apiKey,
    supplier: channelCreds.channelSupplier,
    modelId: toolModelConfig.modelId,
    promptHint: toolModelConfig.promptHint,
    extra,
  };
  if (imgToolName === 'text_to_image') {
    runtimeTools.push(createTextToImageTool(creds, provider, this.imageStorageService));
  } else {
    runtimeTools.push(createImageToImageTool(creds, provider, this.imageStorageService));
  }
}
```

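The order in which Step 4 records a disabled reason can be checked in isolation. The following is a minimal sketch, not part of the planned files: `ResolutionInput` and `disabledReasonFor` are hypothetical names, and the three booleans stand in for the results of `getToolModelConfig`, `resolveForAgent`, and `getImageProvider`.

```typescript
// Hypothetical stand-in types; only the three reason strings come from Step 4.
type DisabledReason =
  | 'model_not_configured'
  | 'model_channel_unavailable'
  | 'image_provider_not_found';

interface ResolutionInput {
  hasModelConfig: boolean;  // getToolModelConfig returned a config
  hasChannelCreds: boolean; // resolveForAgent returned credentials
  hasProvider: boolean;     // getImageProvider returned a provider
}

// Returns the first failing check in resolution order, or null when the tool can be injected.
function disabledReasonFor(input: ResolutionInput): DisabledReason | null {
  if (!input.hasModelConfig) return 'model_not_configured';
  if (!input.hasChannelCreds) return 'model_channel_unavailable';
  if (!input.hasProvider) return 'image_provider_not_found';
  return null;
}

// Example: a tool with a model config but no resolvable channel.
const reason = disabledReasonFor({ hasModelConfig: true, hasChannelCreds: false, hasProvider: false });
console.log(reason);
```

Because the checks run in order, a tool with no model config reports `model_not_configured` even when the channel and provider would also fail to resolve.
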
- [ ] **Step 5: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 6: Commit**

```bash
git add packages/backend/src/modules/netaclaw/service/tool_resolver.ts packages/backend/src/modules/netaclaw/service/tool_registry.ts
git commit -m "feat(netaclaw): integrate text_to_image and image_to_image into tool_resolver"
```

---

### Task 11: Extend attachment hints in Prompt Builder

**Files:**
- Modify: `packages/backend/src/modules/netaclaw/runtime/prompt_builder.ts`
- Modify: `packages/backend/src/modules/netaclaw/runtime/agent.ts`

- [ ] **Step 1: Change the buildLLMMessages signature**

On line 141 of `prompt_builder.ts`, add a `toolNames` parameter to `buildLLMMessages`:

```typescript
export function buildLLMMessages(
  systemPrompt: string,
  history: LLMMessage[],
  userMessage: UserMessageInput,
  toolNames?: string[],
): LLMMessage[] {
```

- [ ] **Step 2: Update the attachment hint generation logic**

Replace the attachment handling block on lines 152-162:

```typescript
  if (userMessage.metadata?.attachments && (userMessage.metadata.attachments as unknown[]).length) {
    const attachments = userMessage.metadata.attachments as ChatAttachment[];
    const desc = attachments.map(a => {
      // Fall back to the generic label for attachment types missing from the map.
      const typeLabel = ({ image: '图片', video: '视频', pdf: 'PDF', document: '文件', other: '文件' } as Record<string, string>)[a.type] ?? '文件';
      return `- ${typeLabel}: ${a.name} (URL: ${a.url})`;
    }).join('\n');

    // Only mention tools that are actually available to this agent.
    const hints: string[] = [];
    const names = new Set(toolNames ?? []);
    if (names.has('image_recognize')) {
      hints.push('如需分析图片内容,请使用 image_recognize 工具,传入图片 URL');
    }
    if (names.has('image_to_image')) {
      hints.push('如需基于图片生成新图片,请使用 image_to_image 工具,将图片 URL 作为 referenceImage 参数');
    }
    if (hints.length === 0) {
      hints.push('附件已上传,可在需要时引用其 URL');
    }

    messages.push({
      role: 'user',
      content: `[系统提示] 用户上传了以下附件:\n${desc}\n${hints.join('。')}。`,
    });
  }
```

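To see what the hint line looks like for different tool sets, the hint selection from Step 2 can be pulled out into a pure function. This is a sketch for illustration only: `buildAttachmentHints` is a hypothetical helper that the plan does not ask for, and the hint strings are copied from Step 2.

```typescript
// Hypothetical helper mirroring the hint selection in Step 2.
function buildAttachmentHints(toolNames: readonly string[] = []): string {
  const names = new Set(toolNames);
  const hints: string[] = [];
  if (names.has('image_recognize')) {
    hints.push('如需分析图片内容,请使用 image_recognize 工具,传入图片 URL');
  }
  if (names.has('image_to_image')) {
    hints.push('如需基于图片生成新图片,请使用 image_to_image 工具,将图片 URL 作为 referenceImage 参数');
  }
  // Generic fallback when no image-aware tool is available.
  if (hints.length === 0) {
    hints.push('附件已上传,可在需要时引用其 URL');
  }
  // Join with the Chinese full stop and terminate the last sentence.
  return `${hints.join('。')}。`;
}

// Example: an agent that only has text_to_image gets the generic fallback hint.
console.log(buildAttachmentHints(['text_to_image']));
```

Keeping the selection in a pure function like this would make it straightforward to unit-test without constructing a full message history.
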
- [ ] **Step 3: Update the call site in agent.ts**

On lines 96-100 of `agent.ts`, pass `toolNames` to `buildLLMMessages`:

```typescript
const messages: LLMMessage[] = buildLLMMessages(
  agentConfig.systemPrompt,
  history,
  { content: userMessage, metadata: params.userMessageMetadata },
  params.toolNames || tools.map(tool => tool.name),
);
```

- [ ] **Step 4: Verify compilation**

Run: `cd packages/backend && npx tsc --noEmit --pretty 2>&1 | head -20`
Expected: no errors

- [ ] **Step 5: Commit**

```bash
git add packages/backend/src/modules/netaclaw/runtime/prompt_builder.ts packages/backend/src/modules/netaclaw/runtime/agent.ts
git commit -m "feat(netaclaw): dynamic attachment hints based on available tools"
```

---

### Task 12: Frontend renderer-registry and message-item image rendering

**Files:**
- Modify: `packages/frontend/src/modules/agent/tools/renderer-registry.ts`
- Modify: `packages/frontend/src/modules/agent/components/message-item.vue`

- [ ] **Step 1: Extend the rawResult type in renderer-registry**

In the `rawResult` field of the `ToolRenderSource` interface (line 9 of `renderer-registry.ts`), extend the `type` union and add an `images` field:

```typescript
  rawResult?: {
    type: 'text' | 'json' | 'image' | 'images';
    text?: string;
    data?: unknown;
    url?: string;
    mimeType?: string;
    width?: number;
    height?: number;
    bytes?: number;
    originalWidth?: number;
    originalHeight?: number;
    originalBytes?: number;
    resized?: boolean;
    images?: {
      url: string;
      mimeType?: string;
      width?: number;
      height?: number;
      seed?: number;
    }[];
  };
```

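Since `rawResult` now carries either a single `url` or an `images` array, any code that needs the list of previewable URLs has to branch on `type`. The sketch below shows one way to do that under a trimmed-down copy of the type from Step 1. `RawResultLike` and `previewUrls` are hypothetical names and are not part of the plan.

```typescript
// Trimmed-down, hypothetical copy of the rawResult shape from Step 1.
interface ImageItemLike {
  url: string;
  mimeType?: string;
  width?: number;
  height?: number;
  seed?: number;
}

interface RawResultLike {
  type: 'text' | 'json' | 'image' | 'images';
  text?: string;
  url?: string;
  images?: ImageItemLike[];
}

// Collects every URL that the single-image or multi-image branch would render.
function previewUrls(raw: RawResultLike | undefined): string[] {
  if (!raw) return [];
  if (raw.type === 'image') return raw.url ? [raw.url] : [];
  if (raw.type === 'images') return (raw.images ?? []).map(img => img.url);
  return [];
}

// Example with made-up URLs.
console.log(previewUrls({ type: 'images', images: [{ url: '/a.png' }, { url: '/b.png' }] }));
```

A helper like this could also replace the inline `.map()` in a template binding, so the preview list is computed in one place.
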
- [ ] **Step 2: Update the message-item.vue template to upgrade single-image rendering and add multi-image rendering**

Find the existing single-image rendering block on lines 75-83 of `message-item.vue`:

```vue
<div
  v-if="tool.rawResult?.type === 'image' && tool.rawResult.url"
  class="tool-execution__image"
>
  <img :src="tool.rawResult.url" @click="openToolImage(tool.rawResult.url)" />
  <div class="tool-execution__image-caption">
    {{ formatToolImageCaption(tool.rawResult) }}
  </div>
</div>
```

Replace it with:

```vue
<div
  v-if="tool.rawResult?.type === 'image' && tool.rawResult.url"
  class="tool-execution__image"
>
  <el-image
    :src="tool.rawResult.url"
    fit="contain"
    :preview-src-list="[tool.rawResult.url]"
    preview-teleported
    class="tool-execution__image-single"
  />
  <div class="tool-execution__image-caption">
    {{ formatToolImageCaption(tool.rawResult) }}
  </div>
</div>
<div
  v-else-if="tool.rawResult?.type === 'images' && tool.rawResult.images?.length"
  class="tool-execution__image"
>
  <div v-if="tool.rawResult.text" class="tool-execution__image-caption">
    {{ tool.rawResult.text }}
  </div>
  <div class="tool-execution__image-grid">
    <el-image
      v-for="(img, idx) in tool.rawResult.images"
      :key="idx"
      :src="img.url"
      fit="cover"
      :preview-src-list="tool.rawResult.images.map(i => i.url)"
      :initial-index="idx"
      preview-teleported
      class="tool-execution__image-grid-item"
    />
  </div>
</div>
```

- [ ] **Step 3: Add multi-image grid styles**

In the `<style>` block of `message-item.vue`, near the existing `.tool-execution__image` styles, add:

```css
.tool-execution__image-single {
  max-width: 360px;
  border-radius: 8px;
}
.tool-execution__image-grid {
  display: grid;
  grid-template-columns: repeat(auto-fill, minmax(140px, 1fr));
  gap: 8px;
  margin-top: 8px;
}
.tool-execution__image-grid-item {
  width: 100%;
  aspect-ratio: 1;
  border-radius: 8px;
  cursor: pointer;
  object-fit: cover;
}
```

- [ ] **Step 4: Verify frontend compilation**

Run: `cd packages/frontend && npx vue-tsc --noEmit 2>&1 | head -20`
Expected: no new errors

- [ ] **Step 5: Commit**

```bash
git add packages/frontend/src/modules/agent/tools/renderer-registry.ts packages/frontend/src/modules/agent/components/message-item.vue
git commit -m "feat(frontend): upgrade image rendering and add multi-image grid in message-item"
```

---

### Task 13: Frontend image generation config section on the tool editor page

**Files:**
- Modify: `packages/frontend/src/modules/agent/views/tools.vue`

- [ ] **Step 1: Add an image generation config section to the editor drawer**

In the editor drawer of `tools.vue`, find the model config area (around lines 475-500) and add the following after the closing `</template>` of the `<template v-if="editor.requiresModel === 1">` block:

```vue
<template v-if="isImageTool">
  <el-divider>{{ t('图片生成配置') }}</el-divider>
  <el-alert type="info" :closable="false" style="margin-bottom: 16px">
    默认值在 Agent 未指定时生效,Agent 可根据用户指令覆盖。硬上限不可突破。
  </el-alert>
  <el-row :gutter="16">
    <el-col :span="8">
      <el-form-item label="默认数量">
        <el-input-number v-model="imageDefaults.n" :min="1" :max="9" :step="1" style="width: 100%" />
      </el-form-item>
    </el-col>
    <el-col :span="8">
      <el-form-item label="默认比例">
        <el-select v-model="imageDefaults.aspectRatio" clearable placeholder="不限" style="width: 100%">
          <el-option label="1:1" value="1:1" />
          <el-option label="16:9" value="16:9" />
          <el-option label="4:3" value="4:3" />
          <el-option label="3:2" value="3:2" />
          <el-option label="2:3" value="2:3" />
          <el-option label="3:4" value="3:4" />
          <el-option label="9:16" value="9:16" />
        </el-select>
      </el-form-item>
    </el-col>
    <el-col :span="8">
      <el-form-item label="默认水印">
        <el-switch v-model="imageDefaults.watermark" />
      </el-form-item>
    </el-col>
  </el-row>
  <el-row :gutter="16">
    <el-col :span="8">
      <el-form-item label="最大数量">
        <el-input-number v-model="imageConstraints.maxN" :min="1" :max="9" :step="1" style="width: 100%" />
      </el-form-item>
    </el-col>
    <el-col :span="8">
      <el-form-item label="最大宽度">
        <el-input-number v-model="imageConstraints.maxWidth" :min="512" :max="4096" :step="64" style="width: 100%" />
      </el-form-item>
    </el-col>
    <el-col :span="8">
      <el-form-item label="最大高度">
        <el-input-number v-model="imageConstraints.maxHeight" :min="512" :max="4096" :step="64" style="width: 100%" />
      </el-form-item>
    </el-col>
  </el-row>
</template>
```

- [ ] **Step 2: Add reactive state and a computed property**

Near the `editor` reactive object in the `<script>` section, add:

```typescript
const isImageTool = computed(() =>
  ['text_to_image', 'image_to_image'].includes(editor.name)
);

const imageDefaults = reactive({
  n: 1,
  aspectRatio: '' as string,
  watermark: false,
});

const imageConstraints = reactive({
  maxN: 9,
  maxWidth: 2048,
  maxHeight: 2048,
});
```

- [ ] **Step 3: Load the image config from extra when the editor drawer opens**

In the logic that opens the editor drawer (`watch(editorVisible, ...)` or the `openEditor` function), add:

```typescript
const extra = (row.extra ?? {}) as Record<string, any>;
// Reset to the built-in defaults first, then overlay any saved values.
// Object.assign skips undefined sources, so a missing config leaves the defaults in place.
Object.assign(imageDefaults, { n: 1, aspectRatio: '', watermark: false }, extra.imageDefaults);
Object.assign(imageConstraints, { maxN: 9, maxWidth: 2048, maxHeight: 2048 }, extra.imageConstraints);
```

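The loading rule in Step 3 is "built-in defaults first, saved values on top". A minimal, framework-free sketch of that merge follows. `ImageDefaultsLike`, `BUILTIN_DEFAULTS`, and `loadImageDefaults` are hypothetical names used only for illustration, with the default values taken from Step 2.

```typescript
// Hypothetical names; the default values mirror the reactive object in Step 2.
interface ImageDefaultsLike {
  n: number;
  aspectRatio: string;
  watermark: boolean;
}

const BUILTIN_DEFAULTS: ImageDefaultsLike = { n: 1, aspectRatio: '', watermark: false };

// Returns a fresh object: built-in defaults overlaid with whatever was saved in extra.
function loadImageDefaults(saved?: Partial<ImageDefaultsLike> | null): ImageDefaultsLike {
  return { ...BUILTIN_DEFAULTS, ...(saved ?? {}) };
}

// Example: only `n` was saved, so the other fields keep their defaults.
console.log(loadImageDefaults({ n: 3 }));
```

Resetting to the defaults on every open matters because the reactive objects are shared across rows; without the reset, values from the previously edited tool would leak into the next one.
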
- [ ] **Step 4: Write the image config into extra on save**

In the save logic (the `handleSave` or `handleUpdate` function), add the following when building `extra`:

```typescript
const extra: Record<string, unknown> = {
  ...(editor.governancePolicy?.allowInSubagent !== undefined ? { allowInSubagent: editor.governancePolicy.allowInSubagent } : {}),
  ...(editor.governancePolicy?.workerRoutingStrategy ? { workerRoutingStrategy: editor.governancePolicy.workerRoutingStrategy } : {}),
};
if (isImageTool.value) {
  extra.imageDefaults = { ...imageDefaults };
  extra.imageConstraints = { ...imageConstraints };
}
```

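The alert text in Step 1 states that defaults apply when the Agent does not specify a value, that the Agent may override them, and that hard caps cannot be exceeded. The sketch below shows that rule for the image count only. `resolveImageCount` is a hypothetical function and its fallbacks (`1` and `9`) are assumptions taken from the form defaults in Step 2; the plan's actual enforcement lives in backend helpers that are not shown in this section.

```typescript
// Hypothetical shapes; field names follow imageDefaults / imageConstraints saved in Step 4.
interface CountDefaults {
  n?: number;
}

interface CountConstraints {
  maxN?: number;
}

// Picks the requested count, else the configured default, then clamps to [1, maxN].
function resolveImageCount(
  requested: number | undefined,
  defaults: CountDefaults,
  constraints: CountConstraints,
): number {
  const wanted = requested ?? defaults.n ?? 1; // assumption: fall back to 1
  const cap = constraints.maxN ?? 9;           // assumption: fall back to 9
  return Math.min(Math.max(1, Math.floor(wanted)), cap);
}

// Example: the Agent asks for 10 images but the admin capped the tool at 4.
console.log(resolveImageCount(10, { n: 2 }, { maxN: 4 }));
```

With this shape, the default only matters when the Agent omits the parameter, while the cap applies to both the default and the Agent's own request.
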
- [ ] **Step 5: Verify frontend compilation**

Run: `cd packages/frontend && npx vue-tsc --noEmit 2>&1 | head -20`
Expected: no new errors

- [ ] **Step 6: Commit**

```bash
git add packages/frontend/src/modules/agent/views/tools.vue
git commit -m "feat(frontend): add image generation config section in tool editor"
```

---

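### Task 14: End-to-end verification

- [ ] **Step 1: Start the backend**

Run: `cd packages/backend && npm run dev`
Expected: starts successfully with no errors

- [ ] **Step 2: Verify the tools are synced to the database**

Open the frontend tool management page, click "同步工具目录" (Sync tool catalog), and confirm that `text_to_image` and `image_to_image` appear in the list with toolset `vision`, capability `multimodal`, and requiresModel `1`.

- [ ] **Step 3: Configure the model channel**

On the tool editor page, configure a model channel for `text_to_image` and `image_to_image` (choose an existing 火山引擎 or MiniMax channel) and select the matching image generation model. Each tool resolves its own model config, so `image_to_image` must be configured as well for Step 7 to work.

- [ ] **Step 4: Configure image generation parameters**

In the "图片生成配置" (image generation config) section of the tool editor page, set the defaults and hard caps, then save.

- [ ] **Step 5: Create a test Agent**

On the Agent editor page, create a test Agent with `text_to_image`, `image_to_image`, and `image_recognize` enabled in its toolset.

- [ ] **Step 6: Test text-to-image**

On the chat page, send the test Agent "生成一张白底电商主图,蓝牙耳机" (generate a white-background e-commerce hero image of Bluetooth earphones) and confirm:
- The Agent calls the `text_to_image` tool
- The tool-card renders the generated image
- The image URL is a locally persisted URL (not a temporary URL)

- [ ] **Step 7: Test image-to-image**

Upload a product image as an attachment, send "基于这张图生成一张白底主图" (generate a white-background hero image based on this picture), and confirm:
- The attachment hint includes the `image_to_image` tool hint
- The Agent calls the `image_to_image` tool with `referenceImage` set to the uploaded image's URL
- The tool-card renders the generated image

- [ ] **Step 8: Test multiple images**

Send "生成 3 张不同角度的产品图" (generate 3 product images from different angles) and confirm:
- The Agent calls `text_to_image` with n=3
- The tool-card renders the 3 images in a grid layout
- Clicking an image opens the full-size preview

- [ ] **Step 9: Final commit**

Once everything works as expected, commit any remaining fixes together.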