system2-attention

📁 plurigrid/asi 📅 Jan 29, 2026
1
Total installs
1
Weekly installs
#41594
Site-wide rank
Install command
npx skills add https://github.com/plurigrid/asi --skill system2-attention

Agent install distribution

codex 1
claude-code 1

Skill documentation

System 2 Attention Skill: Deliberate Reasoning Validation

Status: ✅ Production Ready
Trit: -1 (MINUS – validator/constraint)
Color: #2626D8 (Blue)
Principle: Filter noise via deliberate re-attention
Frame: Two-stage attention with explicit reasoning


Overview

System 2 Attention (S2A) validates and filters transformer attention by regenerating context deliberately. Standard attention (System 1) is fast but susceptible to sycophancy and irrelevant context. S2A re-attends after explicit reasoning.

  1. Context regeneration: the LLM rewrites the context, removing irrelevant information
  2. Two-pass attention: a fast first pass, then a deliberate second pass
  3. Sycophancy reduction: filter out opinion-seeking noise
  4. Factual grounding: anchor answers to verified facts

Core Pattern

S2A(x, context):
  # Pass 1: regenerate the context, dropping irrelevant or leading material
  context_filtered = LLM("Extract only relevant facts from: {context}")

  # Pass 2: answer deliberately on the clean context
  return LLM(x, context=context_filtered)

def system2_attention(query: str, context: str, model) -> str:
    """Two-stage S2A: regenerate the context, then answer on the clean copy."""
    # Stage 1: regenerate the context (remove sycophantic/irrelevant material)
    filter_prompt = (
        "Given the context below, extract only the objective facts "
        "relevant to answering questions. Remove opinions, leading "
        "questions, and irrelevant details.\n\n"
        f"Context: {context}\n\n"
        "Relevant facts only:"
    )

    clean_context = model.generate(filter_prompt)

    # Stage 2: answer the query against the filtered context
    return model.generate(query, context=clean_context)
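A minimal smoke test of the two-stage flow, using a toy stand-in for `model` (the `generate(prompt, context=...)` interface is an assumption carried over from the snippet above, not a real LLM client):

```python
class ToyModel:
    """Toy stand-in for an LLM client; assumes the real `model` exposes
    generate(prompt, context=None), as in the function above."""
    def generate(self, prompt, context=None):
        if context is None:
            # Filter pass: pretend only the factual sentence survives.
            return "The meeting is at 3pm."
        # Answer pass: ground the reply in the filtered context.
        return f"Based on the facts ({context}): the meeting is at 3pm."

def system2_attention(query, context, model):
    clean_context = model.generate(f"Extract only relevant facts from: {context}")
    return model.generate(query, context=clean_context)

noisy = "I love this product! Don't you agree the meeting is at 3pm? Also, cats."
print(system2_attention("When is the meeting?", noisy, ToyModel()))
```

The opinionated and irrelevant material never reaches the answering pass; only the regenerated facts do.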

Key Concepts

1. Context Filtering

class S2AFilter:
    def __init__(self, model):
        self.model = model
    
    def filter_sycophancy(self, context: str) -> str:
        """Remove opinion-seeking and leading content."""
        return self.model.generate(
            f"Rewrite removing any opinions or leading questions:\n{context}"
        )
    
    def filter_irrelevant(self, context: str, query: str) -> str:
        """Keep only query-relevant facts."""
        return self.model.generate(
            f"Extract facts from context relevant to: {query}\n\n{context}"
        )

2. Two-Pass Architecture

import torch

class System2AttentionLayer:
    def __init__(self, base_attention, filter_model, threshold: float = 2.0):
        self.attn = base_attention
        self.filter = filter_model
        self.threshold = threshold  # entropy cutoff for triggering re-attention

    def forward(self, q, k, v, context_mask=None):
        # Pass 1: standard attention (System 1)
        attn_weights = self.attn(q, k, v)

        # Flag high-entropy (uncertain) query positions
        entropy = -torch.sum(attn_weights * torch.log(attn_weights + 1e-9), dim=-1)
        uncertain = entropy > self.threshold

        # Pass 2: deliberate re-attention for uncertain positions only,
        # against keys/values with noisy entries filtered out
        if uncertain.any():
            k_filtered, v_filtered = self.filter(k, v, uncertain)
            attn_weights[uncertain] = self.attn(q[uncertain], k_filtered, v_filtered)

        return attn_weights
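The entropy test above can be checked on toy attention rows without torch; this pure-Python sketch uses an illustrative threshold of 1.0 (not a value from the source):

```python
import math

def attention_entropy(weights):
    """Shannon entropy of one attention row (weights sum to 1)."""
    return -sum(w * math.log(w + 1e-9) for w in weights)

peaked = [0.97, 0.01, 0.01, 0.01]   # confident: mass on one key
diffuse = [0.25, 0.25, 0.25, 0.25]  # uncertain: mass spread evenly

threshold = 1.0  # illustrative cutoff, not from the source
for row in (peaked, diffuse):
    h = attention_entropy(row)
    print(f"{h:.3f}", "re-attend" if h > threshold else "keep")
```

A uniform row over four keys has entropy ln(4) ≈ 1.386 and gets flagged for re-attention; a peaked row stays well below the cutoff.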

3. Factual Grounding Validator

def validate_factual_grounding(response: str, facts: list[str]) -> float:
    """Score the fraction of response claims entailed by verified facts.

    Assumes `extract_claims` (claim splitter) and `entails` (entailment
    check) are provided elsewhere.
    """
    claims = extract_claims(response)
    grounded = sum(1 for c in claims if any(entails(f, c) for f in facts))
    return grounded / len(claims) if claims else 1.0
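`extract_claims` and `entails` are left abstract above. A deliberately naive sketch (sentence splitting plus word-subset entailment; real use would swap in an NLI model) makes the validator runnable end to end:

```python
import re

def extract_claims(response: str) -> list[str]:
    """Naive claim extraction: split on sentence boundaries."""
    return [s.strip() for s in re.split(r"[.!?]+", response) if s.strip()]

def entails(fact: str, claim: str) -> bool:
    """Naive entailment: every word of the claim appears in the fact.
    A real implementation would call an NLI model here."""
    return set(claim.lower().split()) <= set(fact.lower().split())

def validate_factual_grounding(response: str, facts: list[str]) -> float:
    claims = extract_claims(response)
    grounded = sum(1 for c in claims if any(entails(f, c) for f in facts))
    return grounded / len(claims) if claims else 1.0

facts = ["the meeting starts at 3pm in room 4"]
print(validate_factual_grounding("The meeting starts at 3pm. Cats are great.", facts))
# → 0.5 (one of two claims is grounded)
```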

Commands

# Apply S2A filtering
just s2a-filter context.txt query.txt

# Measure sycophancy reduction
just s2a-sycophancy-test model responses/

# Validate factual grounding
just s2a-grounding response.txt facts.txt

Integration with GF(3) Triads

system2-attention (-1) ⊗ causal-inference (0) ⊗ gflownet (+1) = 0 ✓  [Deliberate Search]
system2-attention (-1) ⊗ cognitive-superposition (0) ⊗ forward-forward-learning (+1) = 0 ✓  [Local Validation]
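The triad balance can be checked mechanically; the trit assignments below are taken from the triads above:

```python
def balanced(triad: dict[str, int]) -> bool:
    """A GF(3) triad is balanced when its trits sum to 0 mod 3."""
    return sum(triad.values()) % 3 == 0

deliberate_search = {
    "system2-attention": -1,          # MINUS: validator/constraint
    "causal-inference": 0,            # ERGODIC: coordination
    "gflownet": +1,                   # PLUS: generator
}
local_validation = {
    "system2-attention": -1,
    "cognitive-superposition": 0,
    "forward-forward-learning": +1,
}
print(balanced(deliberate_search), balanced(local_validation))
# → True True
```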

Related Skills

  • causal-inference (0): Coordinate causal reasoning
  • forward-forward-learning (+1): Generate local learning signals
  • proofgeneral-narya (-1): Formal verification baseline

Skill Name: system2-attention
Type: Deliberate Reasoning Validator
Trit: -1 (MINUS)
Color: #2626D8 (Blue)

Scientific Skill Interleaving

This skill connects to the K-Dense-AI/claude-scientific-skills ecosystem:

Graph Theory

  • networkx [○] via bicomodule
    • Universal graph hub

Bibliography References

  • general: 734 citations in bib.duckdb

SDF Interleaving

This skill connects to Software Design for Flexibility (Hanson & Sussman, 2021):

Primary Chapter: 4. Pattern Matching

Concepts: unification, match, segment variables, pattern

GF(3) Balanced Triad

system2-attention (−) + SDF.Ch4 (+) + [balancer] (○) = 0

Skill Trit: -1 (MINUS – validator/constraint)

Secondary Chapters

  • Ch6: Layering
  • Ch7: Propagators

Connection Pattern

Pattern matching extracts structure. This skill recognizes and transforms patterns.

Cat# Integration

This skill maps to Cat# = Comod(P) as a bicomodule in the equipment structure:

Trit: -1 (MINUS)
Home: Prof
Poly Op: ⊗
Kan Role: Adj
Color: #2626D8

GF(3) Naturality

The skill participates in triads satisfying:

(-1) + (0) + (+1) ≡ 0 (mod 3)

This ensures compositional coherence in the Cat# equipment structure.