human-taste
npx skills add https://github.com/alpha-mintamir/human-taste-skill --skill human-taste
Human Taste
Evaluate UX and product design through human taste — the trained judgment that detects whether a design reduces cognitive friction, feels coherent, and fits its audience.
This skill is grounded in research from cognitive psychology, HCI, and design practice. For full citations see references/research-sources.md.
Why This Matters
LLMs can generate designs, but aesthetic judgment involves empathy, cultural awareness, and pattern recognition that require human-calibrated evaluation. Research shows:
- Users form aesthetic impressions within milliseconds (eye-tracking studies)
- Interfaces that reduce cognitive load are perceived as more beautiful (Processing Fluency Theory)
- Taste develops through repeated exposure and operates at a pre-conscious perceptual level
- Good taste means choosing simplicity over mere familiarity (the simple-versus-easy distinction from Rich Hickey’s talk "Simple Made Easy")
This skill provides a structured protocol so agents can approximate that judgment systematically.
Quick Start
When asked to evaluate a design:
- Identify what you are evaluating — screenshot, wireframe, live page, component, or described flow
- Run the rubric below across all six dimensions
- Produce a Human Taste Report using the output template
- Cite specific elements — never give vague praise or criticism
Evaluation Rubric
Score each dimension 1-5. Anchor your score with concrete evidence from the design.
1. Cognitive Load (weight: high)
Does the design minimize unnecessary mental effort?
| Score | Meaning |
|---|---|
| 1 | Overwhelming — too many competing elements, no clear entry point |
| 2 | Heavy — user must work to understand the hierarchy |
| 3 | Moderate — some unnecessary complexity but functional |
| 4 | Light — clear hierarchy, minimal distractions |
| 5 | Effortless — information is exactly where you expect it |
Look for: element count per view, competing focal points, label clarity, progressive disclosure, information grouping.
2. Visual Coherence (weight: high)
Does the design feel unified rather than assembled from parts?
| Score | Meaning |
|---|---|
| 1 | Fragmented — inconsistent spacing, colors, typography |
| 2 | Patchy — some consistency but noticeable breaks |
| 3 | Adequate — follows a system with minor deviations |
| 4 | Cohesive — strong visual rhythm, clear design system |
| 5 | Seamless — every element reinforces the whole |
Look for: spacing consistency, color palette discipline, typographic scale, alignment grid, icon style unity.
3. Interaction Clarity (weight: high)
Can a user predict what happens next at every step?
| Score | Meaning |
|---|---|
| 1 | Opaque — controls are ambiguous, outcomes unclear |
| 2 | Confusing — some actions have surprising results |
| 3 | Functional — most flows are predictable |
| 4 | Clear — affordances are obvious, feedback is immediate |
| 5 | Intuitive — zero learning curve, flows feel inevitable |
Look for: button labels, hover/focus states, loading indicators, error messages, navigation predictability, undo availability.
4. Context Fit (weight: medium)
Does the design match its audience and environment?
| Score | Meaning |
|---|---|
| 1 | Mismatch — tone, density, or style wrong for the audience |
| 2 | Off — partially appropriate but feels generic |
| 3 | Acceptable — reasonable for the context |
| 4 | Tailored — shows awareness of user needs and setting |
| 5 | Perfect fit — feels like it was made for exactly this audience |
Look for: reading level, information density vs audience expertise, platform conventions, accessibility, cultural appropriateness.
5. Restraint (weight: medium)
Does the design know what to leave out?
| Score | Meaning |
|---|---|
| 1 | Bloated — every feature is visible, nothing is prioritized |
| 2 | Cluttered — too many options competing for attention |
| 3 | Balanced — reasonable feature surface |
| 4 | Disciplined — clear priorities, secondary items recede |
| 5 | Minimal — only the essential, nothing to remove |
Look for: feature density, progressive disclosure, empty states, whitespace usage, hidden-by-default patterns.
6. Emotional Response (weight: low)
Does the design evoke the intended feeling?
| Score | Meaning |
|---|---|
| 1 | Repellent — actively unpleasant |
| 2 | Flat — no emotional register |
| 3 | Neutral — inoffensive |
| 4 | Warm — creates mild positive engagement |
| 5 | Delightful — memorable, evokes trust or joy |
Look for: micro-interactions, illustration style, copy tone, color warmth, motion design, personality.
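One way to hold rubric results before writing the report is sketched below. It assumes a hypothetical `DimensionScore` record and a `RUBRIC_WEIGHTS` lookup; neither name comes from this skill. The record rejects scores outside 1-5 and empty evidence, which mirrors the rule that every score must be anchored in something concrete.

```python
from dataclasses import dataclass

# Weight labels copied from the six rubric headings.
# The constant name is illustrative, not part of the skill.
RUBRIC_WEIGHTS = {
    "Cognitive Load": "high",
    "Visual Coherence": "high",
    "Interaction Clarity": "high",
    "Context Fit": "medium",
    "Restraint": "medium",
    "Emotional Response": "low",
}


@dataclass(frozen=True)
class DimensionScore:
    """One row of the Scores table: dimension, 1-5 score, and its evidence."""

    dimension: str
    score: int
    evidence: str

    def __post_init__(self):
        if self.dimension not in RUBRIC_WEIGHTS:
            raise ValueError(f"unknown dimension: {self.dimension!r}")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(self.score, bool) or not isinstance(self.score, int):
            raise ValueError("score must be an integer")
        if not 1 <= self.score <= 5:
            raise ValueError("score must be between 1 and 5")
        if not self.evidence.strip():
            raise ValueError("evidence must cite a specific element")
```

Putting the checks in the constructor means a score without evidence fails early instead of surfacing as an empty cell in the report.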
Output Template
Produce your evaluation in this format:
# Human Taste Report
**Subject:** [what was evaluated]
**Date:** [date]
**Overall Score:** [weighted average, 1-5, one decimal] / 5
## Scores
| Dimension | Score | Key Evidence |
|-----------|-------|-------------|
| Cognitive Load | X/5 | [specific observation] |
| Visual Coherence | X/5 | [specific observation] |
| Interaction Clarity | X/5 | [specific observation] |
| Context Fit | X/5 | [specific observation] |
| Restraint | X/5 | [specific observation] |
| Emotional Response | X/5 | [specific observation] |
## Strengths
- [concrete strength with evidence]
- [concrete strength with evidence]
## Issues
- **[severity: Critical/Major/Minor]**: [specific issue] -- [why it matters] -- [suggested fix]
## Verdict
[2-3 sentence summary: what works, what does not, and the single highest-impact improvement]
Weighted average formula (high = 3, medium = 2, low = 1; the weights sum to 14): (CognitiveLoad*3 + VisualCoherence*3 + InteractionClarity*3 + ContextFit*2 + Restraint*2 + EmotionalResponse*1) / 14
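The weighted average can be sketched in a few lines of Python. The `WEIGHTS` table and the `overall_score` function are illustrative names, not part of the skill. The numeric weights are the ones in the formula, and the result is rounded to one decimal as the template asks.

```python
from typing import Mapping

# Numeric weights taken from the formula: high = 3, medium = 2, low = 1.
WEIGHTS = {
    "Cognitive Load": 3,
    "Visual Coherence": 3,
    "Interaction Clarity": 3,
    "Context Fit": 2,
    "Restraint": 2,
    "Emotional Response": 1,
}


def overall_score(scores: Mapping[str, int]) -> float:
    """Return the weighted average of six 1-5 scores, rounded to one decimal."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} score must be between 1 and 5")
    total = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return round(total / sum(WEIGHTS.values()), 1)
```

For scores of 4, 3, 4, 5, 3, and 2 in rubric order, the weighted sum is 51, and 51 / 14 rounds to 3.6.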
Comparing Alternatives
When comparing two or more designs:
- Run the rubric on each independently
- Add a Comparison Table showing side-by-side scores
- Declare a winner per dimension and overall
- Explain the tradeoffs — a lower-scoring design may still be right for a specific audience
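The comparison steps above could be sketched as follows. The helper names (`overall`, `winner`, `comparison_table`) and the table layout are assumptions; the skill does not prescribe a format for the Comparison Table. This sketch picks the overall winner from the rounded scores, so two designs that round to the same decimal are reported as a tie.

```python
from typing import Dict, List

# Weights from the weighted average formula; dict order doubles as row order.
WEIGHTS = {
    "Cognitive Load": 3,
    "Visual Coherence": 3,
    "Interaction Clarity": 3,
    "Context Fit": 2,
    "Restraint": 2,
    "Emotional Response": 1,
}


def overall(scores: Dict[str, int]) -> float:
    """Weighted average rounded to one decimal."""
    total = sum(scores[dim] * weight for dim, weight in WEIGHTS.items())
    return round(total / sum(WEIGHTS.values()), 1)


def winner(names: List[str], values: List[float]) -> str:
    """Name of the single highest scorer, or 'Tie' if the top score is shared."""
    best = max(values)
    leaders = [name for name, value in zip(names, values) if value == best]
    return leaders[0] if len(leaders) == 1 else "Tie"


def comparison_table(designs: Dict[str, Dict[str, int]]) -> str:
    """Build a markdown table with one column per design plus a Winner column."""
    names = list(designs)
    lines = [
        "| Dimension | " + " | ".join(names) + " | Winner |",
        "|---" * (len(names) + 2) + "|",
    ]
    for dim in WEIGHTS:
        values = [designs[name][dim] for name in names]
        cells = " | ".join(f"{value}/5" for value in values)
        lines.append(f"| {dim} | {cells} | {winner(names, values)} |")
    totals = [overall(designs[name]) for name in names]
    cells = " | ".join(f"{total:.1f}" for total in totals)
    lines.append(f"| Overall | {cells} | {winner(names, totals)} |")
    return "\n".join(lines)
```

The table only declares winners. The tradeoff explanation still has to be written by hand, because a design that loses overall may still suit a specific audience better.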
Reviewing AI-Generated Designs
AI-generated UI often has specific taste failure modes:
- Over-decoration — gradients, shadows, and effects without purpose
- Generic composition — layouts that feel template-driven rather than content-driven
- Inconsistent density — mixing spacious and cramped sections
- Missing edge states — empty states, error states, loading states not considered
- Surface polish without structural clarity — looks good at first glance but confusing to use
Flag these explicitly when you detect them.
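A minimal sketch of explicit flagging, assuming each flag is emitted as a line in the report's Issues section. The `FailureMode` enum and `Flag` record are hypothetical names, and the bracketed failure-mode tag in front of the issue text is an added convention, not something the template requires.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    """The five AI-generated-design failure modes listed above."""

    OVER_DECORATION = "Over-decoration"
    GENERIC_COMPOSITION = "Generic composition"
    INCONSISTENT_DENSITY = "Inconsistent density"
    MISSING_EDGE_STATES = "Missing edge states"
    SURFACE_POLISH = "Surface polish without structural clarity"


SEVERITIES = ("Critical", "Major", "Minor")  # severities from the Issues template


@dataclass(frozen=True)
class Flag:
    mode: FailureMode
    severity: str
    issue: str
    why: str
    fix: str

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"severity must be one of {SEVERITIES}")

    def to_issue_line(self) -> str:
        """Format as an Issues bullet: severity, tagged issue, reason, fix."""
        return (
            f"- **{self.severity}**: [{self.mode.value}] {self.issue}"
            f" -- {self.why} -- {self.fix}"
        )
```

Tagging each issue with its failure mode makes it easy to see whether a design's problems cluster around one pattern.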
When Not to Use This Skill
- Pure backend/API design with no user-facing component
- Code review for logic correctness (use a code-review skill instead)
- Accessibility audits (this skill covers taste, not WCAG compliance — though the two overlap)
Additional Resources
- For full research citations and sources, see references/research-sources.md
- For worked examples of the rubric in action, see examples.md