trust-but-verify
3
总安装量
2
周安装量
#55446
全站排名
安装命令
npx skills add https://github.com/wojons/skills --skill trust-but-verify
Agent 安装分布
amp
2
cline
2
opencode
2
cursor
2
kimi-cli
2
codex
2
Skill 文档
Trust But Verify
Apply skeptical verification to all system claims, test results, and assumptions through independent validation rather than blind trust, preventing false confidence and accelerating issue detection.
When to use me
Use this skill when:
- Tests pass but you suspect something might still be wrong
- Documentation claims features work but you want to verify
- System memory/brain/progress tracking says something is built
- Stakeholders assume functionality exists based on reports
- You need to validate assumptions before critical decisions
- Building resilience against false positives and blind spots
- Preparing for production releases or high-risk changes
- Onboarding to a system with uncertain quality signals
What I do
1. Claim Identification
-
Extract claims from:
- Test results and coverage reports
- System documentation and specifications
- Progress tracking and memory systems
- Stakeholder expectations and assumptions
- Deployment logs and monitoring dashboards
- Team communications and status updates
-
Categorize claims by:
- Criticality (mission-critical vs nice-to-have)
- Verifiability (easily testable vs ambiguous)
- Source credibility (trusted source vs unknown)
- Time since last verification (fresh vs stale)
2. Verification Strategy Design
-
Select appropriate verification methods:
- Independent test execution (different from original tests)
- System probing and health checks
- User scenario simulation
- Data validation and integrity checks
- Performance benchmarking
- Security penetration testing
- Documentation vs implementation comparison
-
Coordinate with other test types:
- Use unit tests but run them differently
- Run integration tests with different data
- Execute E2E tests with edge cases
- Perform chaos testing to verify resilience claims
- Conduct usability testing to verify user experience claims
3. Skeptical Verification Execution
-
Challenge assumptions deliberately:
- What if the test is testing the wrong thing?
- What if the test passes for the wrong reason?
- What if the feature works but not as users expect?
- What if the system works now but won’t under load?
- What if documentation diverges from implementation?
-
Execute verification with different contexts:
- Different environments (not just test environment)
- Different data sets (not just test data)
- Different user personas (not just happy path)
- Different time periods (not just immediate)
- Different failure conditions (not just success paths)
4. Discrepancy Detection & Reporting
-
Compare claims vs verification results:
- Identify false positives (claims true but verification fails)
- Identify false negatives (claims false but verification passes)
- Measure divergence magnitude (minor vs critical differences)
- Track verification confidence levels
-
Generate actionable insights:
- Specific discrepancies found
- Root cause hypotheses
- Impact assessment
- Priority recommendations
- Verification method effectiveness
Verification Strategies by Claim Type
For “Tests Pass” Claims:
- Verify test quality: Are tests actually testing the right thing?
- Check test coverage: Do tests cover critical paths and edge cases?
- Review test data: Is test data realistic and comprehensive?
- Execute alternative tests: Run similar but different verification tests
- Check test environment: Does test environment match production?
For “Feature Built” Claims:
- Verify functionality: Does feature actually work as described?
- Check user experience: Is feature usable and intuitive?
- Validate integration: Does feature work with other components?
- Test edge cases: How does feature handle unusual situations?
- Verify documentation: Does documentation match implementation?
For “System Operational” Claims:
- Health checks: Is system actually running and responsive?
- Load testing: Does system perform under expected load?
- Failure testing: How does system handle failures?
- Monitoring verification: Are monitoring systems actually catching issues?
- Backup validation: Are backups actually restorable?
For “Memory/Progress” Claims:
- Verify completion: Is claimed work actually complete?
- Check quality: Is completed work production-ready?
- Validate dependencies: Do dependencies actually exist and work?
- Review implementation: Does implementation match design?
- Test deliverables: Do deliverables actually solve the problem?
Examples
# Verify test results claims
npm run verify:test-results -- --test-suite "user-authentication"
npm run verify:test-coverage -- --module "payment-processing"
# Verify feature claims
npm run verify:feature -- --feature "checkout-flow" --claim "handles 1000 concurrent users"
npm run verify:feature -- --feature "report-generation" --claim "exports to PDF format"
# Verify system operational claims
npm run verify:system-health -- --component "database" --claim "redundant and fault-tolerant"
npm run verify:system-performance -- --endpoint "/api/orders" --claim "response < 200ms"
# Verify progress/memory claims
npm run verify:progress -- --task "implement-payment-webhook" --claim "completed and tested"
npm run verify:documentation -- --section "api-reference" --claim "accurately describes endpoints"
# Comprehensive verification
npm run verify:all-claims # Verify all identified claims
npm run verify:critical-claims # Verify only critical claims
npm run verify:stale-claims # Verify claims not recently checked
# Integration with other testing
npm run verify:with -- --test-type chaos --claim "system-resilient"
npm run verify:with -- --test-type security --claim "no-vulnerabilities"
npm run verify:with -- --test-type usability --claim "user-friendly"
Output format
Trust But Verify Report
ââââââââââââââââââââââââââââââ
Verification Context: Pre-production release validation
Total Claims Identified: 47
Claims Verified: 23 (priority order)
Verification Duration: 2 hours 15 minutes
Critical Claim Verification Results:
1. Claim: "Payment processing tests pass with 100% coverage"
Source: CI/CD pipeline report
Verification Strategy: Independent test execution + coverage analysis
Result: â DISCREPANCY FOUND
- Tests pass but don't validate currency conversion rates
- Coverage shows 100% but misses error handling paths
- Test data uses only USD, missing other currencies
Recommendation: Add currency conversion tests, expand test data
2. Claim: "System handles 5000 concurrent users"
Source: Performance test report from 2 weeks ago
Verification Strategy: Fresh load test with different patterns
Result: â ï¸ PARTIALLY VERIFIED
- System handles 5000 users but response time degrades by 300%
- Database connection pool exhausted at 4500 users
- CPU usage reaches 95% at target load
Recommendation: Optimize database connections, add autoscaling
3. Claim: "User registration feature complete"
Source: Project management system
Verification Strategy: End-to-end testing + security review
Result: â
VERIFIED
- Registration flow works correctly
- Email verification functional
- Password security requirements enforced
- No security vulnerabilities found
4. Claim: "Monitoring alerts configured for all critical errors"
Source: DevOps runbook
Verification Strategy: Error injection + alert monitoring
Result: â DISCREPANCY FOUND
- Database connection errors not alerting
- Payment gateway timeouts not monitored
- Alert thresholds too high for business impact
Recommendation: Review and update alert configuration
5. Claim: "Backup system tested and functional"
Source: System documentation
Verification Strategy: Actual backup restore test
Result: â ï¸ PARTIALLY VERIFIED
- Backup creation works
- Restore process documented but untested
- Restore time exceeds RTO (Recovery Time Objective)
Recommendation: Test full restore, optimize restore process
Verification Confidence Assessment:
- High Confidence: 8 claims (thoroughly verified)
- Medium Confidence: 10 claims (partially verified)
- Low Confidence: 5 claims (insufficient verification)
- Failed Verification: 5 claims (discrepancies found)
Critical Issues Requiring Attention:
1. Payment currency conversion untested (business risk: high)
2. Database connection pool limits scalability (performance risk: high)
3. Missing critical error alerts (operational risk: medium)
4. Backup restore untested (recovery risk: medium)
Verification Effectiveness:
- False positives prevented: 3 (would have caused production issues)
- Assumptions challenged: 12 (revealed hidden risks)
- Verification time vs value: High ROI (2 hours prevented days of issues)
- Recommendations generated: 7 actionable improvements
Next Steps:
1. Address critical discrepancies before release
2. Improve test coverage for payment processing
3. Optimize database connection management
4. Update monitoring and alert configuration
5. Schedule regular verification for high-risk claims
Notes
- Trust but verify is a mindset, not just a technical process
- Balance verification effort with risk and criticality
- Document verification methods and results for audit trails
- Use verification findings to improve original testing and claims
- Consider verification as ongoing process, not one-time event
- Involve different perspectives in verification (fresh eyes see different things)
- Measure verification effectiveness over time
- Share verification findings transparently with stakeholders
- Use verification to build system understanding, not just find faults
- Adapt verification strategies based on what you learn
- Remember: absence of evidence is not evidence of absence