prompt-engineering

📁 oakoss/agent-skills 📅 4 days ago

总安装量

周安装量

#34130

全站排名

安装命令

npx skills add https://github.com/oakoss/agent-skills --skill prompt-engineering

Agent 安装分布

claude-code 7

mcpjam 5

kilo 5

junie 5

windsurf 5

zencoder 5

Skill 文档

Prompt Engineering

Advanced prompt design for LLMs and autonomous agents. Covers reasoning patterns, template systems, optimization workflows, agentic orchestration, extended thinking, and tool use prompting.

When to use: Designing prompts that require structured reasoning, building agent loops, optimizing LLM output quality, creating reusable prompt templates, configuring extended thinking for complex tasks, or designing multimodal prompts with images and text.

When NOT to use: Simple factual queries, direct lookups, or creative writing that benefits from open-ended generation.

Key Principles

Explicit over implicit — Modern models (Claude 4.x, GPT-4.1) follow instructions literally. Be specific about desired output, format, and behavior rather than relying on the model to infer intent.
Objective over instruction — For reasoning models (OpenAI o-series, Claude with extended thinking), state the goal rather than prescribing step-by-step methods. These models plan natively.
Structure signals intent — Use XML tags, clear delimiters, and consistent formatting to communicate prompt structure. Models trained on structured prompts parse them more reliably than plain text.
One good example beats many rules — Few-shot examples with consistent formatting anchor model behavior more effectively than verbose instructions.
Feedback loops are built-in — Design prompts that ask the model to verify, critique, or score its own output before finalizing.
Token economy matters — Every extra token adds latency and cost. Compress context, remove filler, and front-load critical information.

Model-Specific Considerations

Claude 4.x models follow instructions with high precision. They take prompts literally and do exactly what is asked — no more, no less. Use XML tags to structure prompt sections (<rules>, <context>, <output_format>). Frame instructions positively (describe what to do, not what to avoid). Provide context or motivation behind instructions so Claude can generalize. Extended thinking and interleaved thinking provide native reasoning capabilities.

OpenAI o-series models (o3, o4-mini) use internal reasoning tokens before responding. Use developer messages instead of system messages. Write detailed function descriptions as interface contracts. Do not add explicit reasoning prompts — these models reason natively and additional planning prompts can hurt performance. Pass back persisted reasoning items for multi-turn conversations.

GPT-4.1 and standard models benefit from explicit step-by-step instructions, few-shot examples, and structured output schemas. These models do not have native reasoning loops, so CoT prompting and structured thinking protocols add measurable value.

Multimodal models (GPT-4o, Claude with vision, Gemini) accept images alongside text. Provide context about what each image represents, use clear action verbs, and crop images to relevant regions. Label multiple images explicitly and specify their relationship.

Quick Reference

Pattern	API / Technique	Key Point
Zero-shot CoT	`"Let's think step by step"` trigger	Elicits reasoning without examples
Few-shot CoT	Explicit reasoning chain examples	One good example beats many rules
Self-consistency	Multiple paths + majority vote	Higher accuracy on complex tasks
Tree-of-Thoughts	Generate 3+ strategies, eliminate weakest	Parallel exploration with pruning; high cost
ReAct loop	Thought-Action-Observation cycle	Agent reasons and acts in unison
System prompt	Role + Expertise + Guidelines + Format	Foundation for all LLM behavior
Prompt template	Modular composition with variable slots	Reusable, validated, cacheable
A/B testing	Statistical comparison of prompt variants	Isolate variables, measure significance
Extended thinking	Budget-controlled deep reasoning (Claude)	Let model think before responding
Interleaved thinking	Think between tool calls (Claude 4)	Reason after each tool result
Think tool	No-op tool for structured reasoning space	Gives agents a place to reason mid-turn
Reasoning models	Objective-based prompting for o3/o4-mini	Let the model plan its own reasoning
Structured thinking	Understanding-Analysis-Execution protocol	Forces verification before acting
XML structuring	Tags to delimit prompt sections	Models parse structured prompts reliably
Multimodal prompting	Text + image context for vision models	Provide spatial context and clear action verbs
Confidence scoring	Model self-reports certainty per claim	Quantifies reliability of output
Token optimization	Compress context, remove filler words	Reduce latency and cost

Common Mistakes

Mistake	Correct Pattern
Overloading a single prompt with too many instructions	Use hierarchical rules with clear priority ordering
Forcing rigid step-by-step on reasoning models	Use objective-based prompts; reasoning models plan natively
Setting max output tokens too low for reasoning models	Allocate sufficient tokens for internal chain-of-thought
Using static examples for complex tasks	Select examples dynamically via semantic similarity
Inconsistent formatting across few-shot examples	All examples must follow identical input-output structure
Manually parsing unstructured LLM output	Use JSON mode or structured output schemas
Ignoring token budget allocation	Reserve tokens for system prompt, examples, input, and response
Skipping baseline measurement before optimizing	Establish metrics first, then change one variable at a time
Using CoT prompts on reasoning models	Redundant; these models reason natively without explicit triggers
Telling models what NOT to do instead of what to do	Frame instructions positively: describe the desired behavior
Passing thinking blocks back as user text (Claude)	Pass thinking blocks unmodified in assistant message only
Over-prompting reasoning models to “plan more”	Additional planning prompts can degrade reasoning model performance

Prompt Engineering Workflow

Define the objective — State what the prompt should achieve and how success is measured
Choose the right pattern — Match the task to CoT, ReAct, ToT, or simple prompting based on complexity
Select the model tier — Route to lightweight, standard, or reasoning models based on task difficulty
Write the baseline prompt — Start simple; use system prompt structure with XML tags for complex cases
Add examples — Include 1-3 few-shot examples with consistent formatting if the task requires them
Test and measure — Establish baseline metrics (accuracy, latency, token usage) on representative inputs
Analyze failures — Categorize errors (format, factual, logical, incomplete) and address the most impactful
Iterate one variable — Change one element at a time to isolate what improves performance
Version and deploy — Track prompt versions alongside performance data for rollback capability

Delegation

Explore prompt variants and compare model responses: Use Explore agent to test prompt strategies across different inputs
Build multi-step agentic workflows with tool use: Use Task agent to implement and validate ReAct loops and autonomous chains
Design hierarchical prompt architecture for complex systems: Use Plan agent to structure prompt systems with verification loops

If the expert-instruction skill is available, delegate system prompt design and agent persona crafting to it.

References

Chain-of-Thought — Step-by-step reasoning, self-consistency, least-to-most decomposition
Few-Shot Learning — Example selection strategies, token-aware truncation, edge cases
Prompt Templates — Template architecture, modular composition, validation, caching
Prompt Optimization — A/B testing, failure analysis, metrics, version control
System Prompts — Role definition, constraint specification, dynamic adaptation
Reasoning Model Optimization — Objective-based prompting for o3/o4-mini, extended thinking configuration
Tree-of-Thoughts — Parallel branch exploration, evaluation, synthesis
ReAct Patterns — Thought-Action-Observation loop, tool discovery, error recovery
Structured Thinking — Adversarial critic protocol, confidence scoring, metadata tagging
Extended Thinking and Tool Use — Budget configuration, interleaved thinking, think tool pattern
Multimodal Prompting — Vision model techniques, image context, cross-modal alignment

GitHub 仓库 ↗ ← 返回陌讯 Skills 聚合平台