[Deconstruction] One Model, Two Personas in Claude: Engineering Differences Between Anthropic's General and Design System Prompts

2026/5/6 22:45:49

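Keywords: Claude Opus 4.7 | Claude Design | System Prompt | Agent Architecture | Prompt Engineering | Multi-Persona

What you get from this article:

- A full analysis of how Anthropic builds two products from one model and two prompts
- A comparison across 7 engineering dimensions (identity / proactivity / questions / format / variants / verification / copyright)
- An agent persona engineering checklist
- Code-level implementation suggestions you can copy directly

I. Background: an architecture decision worth noting

Most AI product teams face a key choice when designing AI for multiple scenarios:

- Option A: train several specialized models (high cost, long cycles, but focused results)
- Option B: use one base model with different system prompts (low cost, flexible, but how far the results diverge is unknown)

Anthropic chose Option B. On the same Claude Opus 4.7 base, it derives two products from two very different system prompts:

- General Claude: the claude.ai chat interface and the API
- Design Claude: Claude Artifacts and the design product line

Both prompts were leaked publicly through the CL4R1T4S repository. This article dissects the two files side by side and, from an engineering angle, abstracts a reproducible method for **differentiating personas through prompts**.

Intended readers:

- Teams building multi-scenario AI products (customer service + analysis, code + design, and so on)
- Teams weighing whether to train a specialized model for each scenario
- Anyone interested in persona and role design for AI agents

II. Basic parameters of the two prompts

| Dimension | Claude general version | Claude Design version |
|---|---|---|
| File | Claude-Opus-4.7.txt | Claude-Design-Sys-Prompt.txt |
| Lines | 1408 | 422 |
| File size | 149KB | ~18KB |
| Underlying model | Claude Opus 4.7 | Claude Opus 4.7 (same) |
| Main sections | 13 major sections (behavior / copyright / skills / memory / …) | 10 major sections (workflow / output guidelines / design process / …) |
| Preset tools | A few core tools + lazy loading via tool_search | 30 fixed tools, all exposed |

Observation: the Design version is only about 30% as long as the general version by line count (422 vs 1408 lines), yet its tool set is larger and more fixed. This reflects a difference in design philosophy: the general version aims for breadth and stability, the Design version for depth and specialization.

III. Side-by-side comparison across 7 engineering dimensions

This is the core of the article. Each dimension gives (1) the original text from both versions, (2) the engineering implication, and (3) a transferable implementation.

Dimension 1: Identity

General version:

    You are Claude, made by Anthropic. You are a helpful, harmless, and honest AI assistant.

Design version:

    You are an expert designer working with the user as a manager. You produce design artifacts on behalf of the user using HTML. You operate within a filesystem-based project. ...begin your html file with some assumptions + context + design reasoning, as if you are a junior designer and the user is your manager.

Engineering implication:

- General version: identity = a combination of adjectives (helpful, harmless, honest)
- Design version: identity = social role + relationship (designer + the user is the boss)

The second design is noticeably stronger. The agent receives not an abstract label but a complete social script: what to report, when to ask, how to raise objections, and what the deliverable is.

Transferable implementation:

    # Anti-pattern: identity as a bare adjective
    BAD_IDENTITY = "You are a helpful PM assistant."

    # Recommended: identity as a role with relationships and deliverables
    GOOD_IDENTITY = """
    You are a senior product manager at a Series B SaaS company.

    Your reporting relationship:
    - Your manager is the CTO (the user)
    - You manage 2 junior developers (other agents or subordinates)

    Your deliverables:
    - Product Requirements Documents (PRDs) that devs can implement without asking back
    - Written in markdown with specific acceptance criteria

    Your working style:
    - As if you are a mid-level PM reporting to your CTO boss
    - Propose, don't decide. Challenge assumptions, but defer final call to the user.
    - When requirements are unclear, you MUST ask clarifying questions before producing deliverables.
    """

Dimension 2: Proactivity

General version:

    Claude does its best to address the person's query, even if ambiguous, before asking for clarification or additional information.

Even when the question is ambiguous, Claude tries to answer first rather than asking back. This is a passive-response mode.

Design version:

    If stuck, try listing design assets, ls'ing design systems files -- be proactive! Some designs may need multiple design systems -- get them all! You should also use the starter components to get high-quality things like device frames for free.

The key phrase is "be proactive!", an explicit order to explore. When blocked, the agent should ls files, dig through resources, and try the starter components on its own initiative.

Engineering implication: proactivity has to be authorized explicitly. If the prompt does not grant it, an LLM defaults to conservative behavior.

Transferable implementation:

    PROACTIVITY_SECTION = """
    ## Initiative Level: HIGH

    When you encounter uncertainty, DO NOT wait for user input. Instead:

    1. **Explore first**: Use list_files, grep, read_file to understand context before asking
    2. **Hypothesize second**: Form your best guess and state it explicitly
       ("My assumption: X. Proceeding unless you object.")
    3. **Ask last**: Only ask the user when exploration + hypothesis fails to resolve ambiguity

    When stuck:
    - List relevant directories to see what's available
    - Search codebase for similar patterns
    - Try the most likely approach with low-risk tools first
    - Report findings, don't report blocks
    """

Dimension 3: Question Density

General version:

    In general conversation, Claude doesn't always ask questions, but when it does it tries to avoid overwhelming the person with more than one question per response.

Design version:

    Use the questions_v2 tool when starting something new or the ask is ambiguous -- one round of focused questions is usually right.
    Tips:
    - Always ask whether they'd like variations, and for which aspects
    - Always ask whether the user wants divergent visuals or interactions
    - Ask at least 4 other problem-specific questions
    - Ask at least 10 questions, maybe more.

The gap is stark: at most one question per response vs at least ten questions up front.

Engineering implication: question density should be a function of task ambiguity, not a one-size-fits-all setting.

Transferable implementation:

    def question_policy_by_task(task_type: str) -> dict:
        """Return the question policy configured for a task type."""
        policies = {
            "factual_query": {
                "max_questions": 0,
                "trigger": "never ask, answer directly",
            },
            "simple_task": {
                "max_questions": 1,
                "trigger": "only if critical info is missing",
            },
            "code_implementation": {
                "max_questions": 2,
                "trigger": "ask about edge cases and error handling",
            },
            "design_from_scratch": {
                "max_questions": 10,
                "trigger": "ALWAYS ask before producing anything",
            },
            "architecture_design": {
                "max_questions": 8,
                "trigger": "ask about scale, constraints, non-functionals",
            },
        }
        # Raises KeyError for an unknown task type
        return policies[task_type]

Dimension 4: Formatting Preferences

General version (very strict):

    Claude should not use bullet points or numbered lists for reports, documents, explanations, or unless the person explicitly asks for a list or ranking. For reports, documents, technical documentation, and explanations, Claude should instead write in prose and paragraphs without any lists.

Design version: no formatting constraints anywhere in the prompt.

Engineering implication: different agents produce fundamentally different outputs, so their evaluation criteria should differ too.

| Agent type | Output | Quality criteria |
|---|---|---|
| General chat | Text | Readability, concision, no slop |
| Design | HTML / design mockups | Visual taste, usability, variants |
| Code | Code | Runs, clean, tested |
| Analysis | Reports | Logic, data, insight |

Do not constrain every agent with one uniform prompt.

Dimension 5: Variation Strategy

General version: no relevant clause.

Design version:

    Give options: try to give 3+ variations across several dimensions. Mix by-the-book designs that match existing patterns with new and novel interactions. Start your variations basic and get more advanced and creative as you go! The goal here is not to give users the perfect option; it's to explore as many atomic variations as possible, so the user can mix and match and find the best ones.

Core idea: the last sentence is the essence. The goal is not one perfect solution but enough atomic material for the user to assemble their own.

Engineering implication: a creative task is essentially a search problem (finding good solutions in a space of possibilities), not an optimization problem (polishing a single solution to the limit).

Transferable implementation:

    # Applies to any creative agent: copywriting, design, code proposals, strategy
    VARIATION_POLICY = """
    ## Multi-Variant Generation Policy

    For any creative or open-ended task, default to producing 3 variants:

    Variant A (Safe/Standard):
    - Follows established patterns in the codebase/domain
    - Low-risk, proven approach
    - Reasonable trade-offs

    Variant B (One-Dimensional Deviation):
    - Keeps 80% of Variant A
    - Deviates boldly on ONE specific dimension
      (e.g., different algorithm, different tech, different style)
    - Useful for A/B comparison

    Variant C (Creative/Experimental):
    - Novel approach
    - Higher risk, potentially higher reward
    - Explore what's possible

    Present all 3 with trade-off analysis. Let the user mix and match.
    """

Dimension 6: Verification

General version: Claude checks its own output (self-check mode).

Design version:

    Do not perform your own verification before calling done; do not proactively grab screenshots to check your work; rely on the verifier to catch issues without cluttering your context. Once done reports clean, call fork_verifier_agent. It spawns a background subagent with its own iframe to do thorough checks (screenshots, layout, JS probing). Silent on pass -- only wakes you if something's wrong.

Engineering implication: the main agent's context is already filled with task history, so self-checking amounts to letting exam takers grade their own papers. The better approach is to fork a verifier subagent with an independent context to do the checking.

Architecture sketch:

    ┌──────────────────────────────┐
    │ Main Design Agent            │
    │ (context: task history)      │
    └──────────┬───────────────────┘
               │ call done()
               ↓
    ┌──────────────────────────────┐
    │ fork_verifier_agent          │ ← independent process
    │ (context: fresh + artifact)  │
    │ - screenshots                │
    │ - run JS checks              │
    │ - read console logs          │
    └──────────┬───────────────────┘
               │ reports back only when issues are found
               ↓
    [silent pass] or [issue report]

Transferable implementation (Python-style pseudocode; new_agent, VERIFIER_PROMPT, and VERIFICATION_TOOLS are placeholders to be supplied by your own framework):

    async def run_with_independent_verifier(agent, task):
        # The main agent does the work
        result = await agent.execute(task)

        # Fork an independent verifier: new LLM session, clean context
        verifier = new_agent(
            system_prompt=VERIFIER_PROMPT,
            context=[],  # deliberately omit the main agent's conversation history
            tools=VERIFICATION_TOOLS,
        )

        # Show the verifier only the final artifact
        verdict = await verifier.verify(
            artifact=result.artifact,
            acceptance_criteria=task.criteria,
        )

        if verdict.has_issues:
            # Feed the issues back to the main agent so it can fix them
            return await agent.fix(result, verdict.issues)
        return result

This pattern is essential for enterprise-grade agent systems. A single agent that does both development and QA almost always breaks down in production.

Dimension 7: Copyright Compliance

General version: the CRITICAL_COPYRIGHT_COMPLIANCE section takes up about 80 lines (three LIMIT rules + 7 self-check items).

Design version: copyright is barely mentioned.

Engineering implication: more constraints are not always better; they should be configured per scenario.

- Adding 80 lines of copyright clauses to a design agent → wasted context and slower responses
- Removing copyright clauses from a general agent → sharply higher legal risk

Principle: each agent's system prompt should contain only the constraints that are actually relevant to its scenario.

IV. Summary table of the 7 dimensions

| # | Dimension | General Claude | Design Claude | Design principle |
|---|---|---|---|---|
| 1 | Identity | Abstract labels (helpful/honest) | Social role (designer + boss) | Role relationship > adjectives |
| 2 | Proactivity | Passive response | Active exploration | Authorize explicitly, otherwise the LLM stays conservative |
| 3 | Questions | At most 1 | At least 10 | Density = function of task ambiguity |
| 4 | Format | No bullets | No constraints | Judge by the nature of the output |
| 5 | Variants | None | 3 required | Creativity = search, not optimization |
| 6 | Verification | Self-check | Outsourced verifier | Separate doing from reviewing, with independent context |
| 7 | Copyright | 80 lines of hard caps | Barely mentioned | Configure constraints per scenario |

V. Agent persona engineering checklist

The following general checklist is abstracted from the 7-dimension comparison. Go through it item by item when designing a new agent:

    ## Role definition layer
    [ ] Identity is a concrete role, not an adjective
    [ ] Reporting line declared (manager is X)
    [ ] Subordinates/collaborators declared (works with Y)
    [ ] Deliverable type is explicit (produces Z)
    [ ] Working style (as if you are a ...)

    ## Behavior defaults layer
    [ ] Proactivity level (low / medium / high)
    [ ] Question density policy (tiered by task type)
    [ ] Default behavior when blocked (explore / hypothesize / wait)
    [ ] How ambiguous questions are handled

    ## Output constraints layer
    [ ] Format preference (prose / list / depends on scenario)
    [ ] Length preference (brief / detailed / auto)
    [ ] Number of variants (1 or 3)
    [ ] Whether to summarize after completion

    ## Safety and compliance layer
    [ ] Scenario-relevant hard limits (copyright / PII / destructive operations)
    [ ] Pre-generation self-check list (questions that must be answered before generating)
    [ ] Constraints kept lean (no irrelevant constraints copied in)

    ## Quality assurance layer
    [ ] Whether an independent verifier subagent is enabled
    [ ] The verifier's acceptance criteria
    [ ] Fix loop on failure
    [ ] Maximum retry count

    ## Memory and context layer
    [ ] Whether cross-session memory exists
    [ ] Trigger conditions for memory retrieval (linguistic signals, etc.)
    [ ] Context compression strategy

    ## Tool capability layer
    [ ] Core tool list (lean)
    [ ] Whether the tool_search meta-tool is enabled
    [ ] Parallel/serial strategy for tool calls

VI. Practical advice: when to differentiate agent personas

Not every multi-scenario AI needs multiple personas. The decision matrix below helps.

Signals that you should differentiate:

- ✅ The output types differ fundamentally across scenarios (text vs code vs design mockups)
- ✅ The appropriate question density differs enormously across scenarios (customer service vs product analysis)
- ✅ The safety constraints are scenario-specific (medical vs entertainment)
- ✅ The tool sets differ substantially (50% or more do not overlap)

Signals that you do not need to differentiate:

- ❌ Only the tone differs (formal vs casual) → control it in the user prompt
- ❌ Only the topic domain differs (programming vs writing) → add domain knowledge to the system prompt
- ❌ Only the preferred output length differs → handle it at the instruction level

Rule of thumb: if you cannot name at least 3 dimensions on which two scenarios differ in essence, do not differentiate; keep a single agent.

VII. FAQ

Q1: How do we know both prompts run on the same Claude Opus 4.7 model?

The general prompt states it directly: "This iteration of Claude is Claude Opus 4.7 from the Claude 4.7 model family." The Design prompt does not name its model explicitly. Based on Anthropic's publicly described product architecture (Artifacts builds on Opus capabilities) and on the way the prompt refers to claude-haiku-4-5, the author infers that the main agent is an Opus-family model and considers this cross-check highly credible.

Q2: Why doesn't Anthropic train a dedicated Design model?

Economics. Training a new model means massive data + a large GPU budget + a cycle of several months, and the result is useful only in one scenario. Writing a prompt means a few weeks of product-manager work, ships immediately, and can be changed at any time. As long as quality is sufficient, the ROI of prompt engineering far exceeds that of specialized training. The lesson for application-side products is clear.

Q3: My product uses GPT-4 or an open-source model. Do these principles apply?

The core principles (role-based identity, authorized proactivity, tiered question density, separation of doing and reviewing) are model-agnostic and apply to any instruction-following LLM. The exact wording needs tuning per model: for example, GPT-4 is sensitive to concise instructions, while open-source models often need more few-shot examples.

Q4: Are there open projects that implement these patterns?

A few worth following:

- Devin 2.0 system prompt (CL4R1T4S/DEVIN/): a three-state machine (Planning / Standard / Edit)
- Cursor 2.0 system prompt (CL4R1T4S/CURSOR/): a lean tool set for an IDE coding assistant
- Manus prompt (CL4R1T4S/MANUS/): Event Stream with four decoupled modules
- Hermes Agent (GitHub NousResearch/hermes-agent, 84k stars): pluggable ContextEngine and an on_pre_compress memory hook

VIII. References

- CL4R1T4S repository: https://github.com/elder-plinius/CL4R1T4S
- Files analyzed in this article: ANTHROPIC/Claude-Opus-4.7.txt (1408 lines, general version) and ANTHROPIC/Claude-Design-Sys-Prompt.txt (422 lines, Design version)
- Prerequisite reading (previous article): "Deep Dive into the Claude Opus 4.7 System Prompt: 5 Prompt Engineering Practices Reverse-Engineered from 1408 Lines of Instructions"
- Recommended further reading:
  - Constitutional AI: Harmlessness from AI Feedback (Anthropic)
  - Reflexion: Language Agents with Verbal Reinforcement Learning
  - Anthropic official blog, "Claude's Constitution"

About the author: an AI agent engineering practitioner who is building OpenClaw, a multi-agent collaboration system. The transferable practices in this article were all tested inside the OpenClaw project.

Source material for this article can be rechecked at these locations (on the author's local machine):

    CL4R1T4S-main/ANTHROPIC/Claude-Opus-4.7.txt           # 1408 lines, 149KB
    CL4R1T4S-main/ANTHROPIC/Claude-Design-Sys-Prompt.txt  # 422 lines, 18KB

If this article helped you design agents, likes, bookmarks, and discussion are welcome. Comments about pitfalls you have hit in multi-agent persona design are especially welcome.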
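
As a closing sketch, the rule of thumb from the practical-advice section (differentiate personas only when two scenarios differ on at least 3 dimensions) can be written down as a small check. Everything here is illustrative and not taken from either prompt: `PersonaProfile`, `differing_dimensions`, and `should_split` are hypothetical names, the dimension labels are copied from the summary table, and plain string inequality is only a crude stand-in for the "essential difference" that a human still has to judge.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class PersonaProfile:
    """One scenario described along the article's 7 dimensions (free-form labels)."""
    identity: str
    proactivity: str
    question_density: str
    formatting: str
    variants: str
    verification: str
    copyright: str


def differing_dimensions(a: PersonaProfile, b: PersonaProfile) -> list[str]:
    """Names of the dimensions whose labels differ between two profiles."""
    return [
        f.name
        for f in fields(PersonaProfile)
        if getattr(a, f.name) != getattr(b, f.name)
    ]


def should_split(a: PersonaProfile, b: PersonaProfile, min_differences: int = 3) -> bool:
    """Apply the rule of thumb: split only with at least `min_differences` differences."""
    return len(differing_dimensions(a, b)) >= min_differences


# Labels taken from the 7-dimension summary table
GENERAL = PersonaProfile(
    identity="abstract labels",
    proactivity="passive response",
    question_density="at most 1",
    formatting="no bullets",
    variants="none",
    verification="self-check",
    copyright="80 lines of hard caps",
)
DESIGN = PersonaProfile(
    identity="social role",
    proactivity="active exploration",
    question_density="at least 10",
    formatting="no constraints",
    variants="3 required",
    verification="outsourced verifier",
    copyright="barely mentioned",
)

# Hypothetical scenario that differs from GENERAL on a single dimension
NEAR_GENERAL = replace(GENERAL, question_density="at most 2")

if __name__ == "__main__":
    print(differing_dimensions(GENERAL, DESIGN))
    print(should_split(GENERAL, DESIGN))
    print(should_split(GENERAL, NEAR_GENERAL))
```

Under these assumptions the general and Design profiles differ on all seven dimensions and clear the threshold, while a scenario that only adjusts question density does not. In practice the labels would come from a design review rather than string comparison, and the threshold of 3 is the article's heuristic, not a measured value.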
