token-saver-context-compression

📁 oimiragieo/agent-studio 📅 10 days ago

Site rank: #13647

Install command

npx skills add https://github.com/oimiragieo/agent-studio --skill token-saver-context-compression

Agent install distribution

github-copilot 27

gemini-cli 27

cursor 27

amp 26

codex 26

kimi-cli 26

Skill documentation

Token Saver Context Compression

Use this skill to reduce token usage while preserving grounded evidence. It integrates:

pnpm search:code (hybrid retrieval)
token-saver Python compression scripts
MemoryRecord persistence into framework memory
spawn prompt evidence injection ([mem:*] / [rag:*])

When to Use

pnpm search:tokens shows a file/directory exceeds 32K tokens
Context is large or expensive and you need a compressed summary
You need query-targeted compression before synthesis
You need hard evidence sufficiency gating before persisting memory
You’re building a prompt and search:code results alone aren’t enough context

Iron Law

Do not persist compressed content directly to memory files from a subprocess. Emit MemoryRecord payloads and let framework hooks process sync/indexing.

Workflow

Retrieve candidate context (pnpm search:code "<query>").
Compress using token-saver in JSON mode (run_skill_workflow.py --output-format json).
If evidence is insufficient and the fail gate (--fail-on-insufficient-evidence) is on, stop.
Map distilled insights into MemoryRecord-ready payloads.
Persist through MemoryRecord so .claude/hooks/memory/sync-memory-index.cjs runs.

Mapping Rule (Deterministic)

gotchas.json:
- text contains gotcha|pitfall|anti-pattern|risk|warning|failure
issues.md:
- text contains issue|bug|error|incident|defect|gap
decisions.md:
- text contains decision|tradeoff|choose|selected|rationale
patterns.json:
- default fallback for all remaining distilled evidence
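
The mapping rule above can be sketched as a small classifier. This is a minimal illustration, not the skill's actual implementation: the keyword lists are copied from the rule, but the function name `classify`, the case-insensitive substring matching, and the first-match-wins order (gotchas, then issues, then decisions) are assumptions.

```python
# Keyword lists copied from the mapping rule above.
# Assumption: rules are checked in the listed order and the first match wins.
MAPPING_RULES = [
    ("gotchas.json", ("gotcha", "pitfall", "anti-pattern", "risk", "warning", "failure")),
    ("issues.md", ("issue", "bug", "error", "incident", "defect", "gap")),
    ("decisions.md", ("decision", "tradeoff", "choose", "selected", "rationale")),
]
DEFAULT_TARGET = "patterns.json"  # fallback for all remaining distilled evidence


def classify(text: str) -> str:
    """Return the memory target for one distilled insight (hypothetical helper)."""
    lowered = text.lower()  # assumption: matching is case-insensitive
    for target, keywords in MAPPING_RULES:
        if any(keyword in lowered for keyword in keywords):
            return target
    return DEFAULT_TARGET


if __name__ == "__main__":
    for sample in (
        "Known pitfall: stale index after rename",
        "Bug in the sync hook",
        "We selected SQLite for local storage",
        "Use hybrid retrieval first",
    ):
        print(f"{classify(sample):15} <- {sample}")
```

Because matching is by substring, a word such as "asterisk" would also match "risk"; a stricter variant could match on word boundaries instead.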

Tooling Commands

Preferred wrapper entrypoint:

node .claude/skills/token-saver-context-compression/scripts/main.cjs --query "<question>" --mode evidence_aware --limit 20 --fail-on-insufficient-evidence

Direct Python engine (advanced):

python .claude/skills/token-saver-context-compression/scripts/run_skill_workflow.py --file <path> --mode evidence_aware --query "<question>" --output-format json --fail-on-insufficient-evidence

Output Contract

Wrapper emits JSON with:
- search summary
- compression summary
- memoryRecords grouped by target (patterns, gotchas, issues, decisions)
- evidence sufficiency status
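
As a rough sketch of consuming this output, the snippet below counts records per target. Only the `memoryRecords` key and the four target names come from the contract above; the assumption that `memoryRecords` is an object mapping each target name to a list, and the helper name `count_records`, are illustrative guesses about the JSON shape.

```python
import json

TARGETS = ("patterns", "gotchas", "issues", "decisions")


def count_records(wrapper_output: str) -> dict[str, int]:
    """Count memory records per target in the wrapper's JSON output (hypothetical helper)."""
    payload = json.loads(wrapper_output)
    # Assumption: memoryRecords maps each target name to a list of records.
    grouped = payload.get("memoryRecords", {})
    return {target: len(grouped.get(target, [])) for target in TARGETS}


if __name__ == "__main__":
    # Hand-written sample payload, not real tool output.
    sample = json.dumps({
        "memoryRecords": {
            "patterns": [{"text": "Use hybrid retrieval first"}],
            "gotchas": [],
        }
    })
    print(count_records(sample))
```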

Workflow References

Skill workflow: .claude/workflows/token-saver-context-compression-skill-workflow.md
Companion tool: .claude/tools/token-saver-context-compression/token-saver-context-compression.cjs
Command surface: .claude/skills/token-saver-context-compression/commands/token-saver-context-compression.md
Citation format is unchanged:
- memory entries become [mem:xxxxxxxx]
- RAG entries remain [rag:xxxxxxxx]
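
A downstream consumer that wants to check which sources a compressed summary cites could extract the tags with a regular expression. This is a sketch under one assumption: that the `xxxxxxxx` placeholder stands for exactly eight alphanumeric characters. The function name `extract_citations` is hypothetical.

```python
import re

# Assumption: identifiers are exactly 8 alphanumeric characters, per the xxxxxxxx placeholder.
CITATION_RE = re.compile(r"\[(mem|rag):([0-9A-Za-z]{8})\]")


def extract_citations(text: str) -> list[tuple[str, str]]:
    """Return (kind, id) pairs for every [mem:...] or [rag:...] tag, in order of appearance."""
    return CITATION_RE.findall(text)


if __name__ == "__main__":
    summary = "Hooks sync the index [mem:1a2b3c4d] as described in the docs [rag:deadbeef]."
    for kind, ident in extract_citations(summary):
        print(kind, ident)
```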

Integration with search:tokens

Use pnpm search:tokens to decide when to invoke this skill:

# Check if you need compression
pnpm search:tokens .claude/lib/memory
# Output: 60 files, 500KB, ~128K tokens ⚠ OVER CONTEXT

# Then compress with a targeted query
node .claude/skills/token-saver-context-compression/scripts/main.cjs \
  --query "how does memory persistence work" --mode evidence_aware --limit 10

The tool reads actual file content from search results (not just file paths), compresses via the Python engine, and extracts memory records classified by type (patterns, gotchas, issues, decisions).

Adaptive Compression

Adaptive compression (adjusting compression ratio based on corpus size) is automatic and requires no env var configuration. When the input corpus is small, compression is lighter; when it is large, compression is more aggressive. This is controlled internally by the Python engine based on token counts.
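
The engine's actual heuristic is not documented here. Purely to illustrate the idea of lighter compression for small inputs and more aggressive compression for large ones, one possible shape is a fixed token budget: keep everything while the corpus fits, and keep a shrinking fraction once it does not. The function name, the budget value, and the formula are all invented for this example.

```python
def keep_fraction(token_count: int, budget: int = 32_000) -> float:
    """Fraction of tokens to keep for a corpus of the given size.

    Hypothetical illustration only: the real engine's rule is internal and may differ.
    """
    if token_count <= 0:
        return 1.0  # nothing to compress
    # Small corpora fit the budget and are kept whole; larger ones are cut to fit it.
    return min(1.0, budget / token_count)


if __name__ == "__main__":
    for size in (16_000, 64_000, 128_000):
        print(size, keep_fraction(size))
```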

Requirements

Node.js 18+
Python 3.10+

Iron Laws

ALWAYS run hybrid search (pnpm search:code) before compressing to retrieve grounded evidence for the distilled output
NEVER compress context that still has open uncertainties; resolve ambiguities before compressing
ALWAYS persist distilled learnings via MemoryRecord immediately after compression
NEVER discard evidence that contradicts the current working hypothesis during compression
ALWAYS inject [mem:*] and [rag:*] citations in the compressed output for downstream spawn prompt grounding

Anti-Patterns

Anti-Pattern	Why It Fails	Correct Approach
Compressing without prior hybrid search	Output lacks grounded evidence, hallucination risk	Run `pnpm search:code` first, embed citations
Discarding contradicting evidence	Creates false confidence in distilled output	Preserve all conflicting signals in summary
No MemoryRecord after compression	Learnings lost on next context reset	Persist key findings immediately via MemoryRecord
Compressing too late (past 80K tokens)	Severe accuracy degradation before compression	Trigger compression at 80K tokens, not at limit
Skipping `[mem:]` / `[rag:]` citations	Downstream agents cannot verify claims	Always annotate evidence sources in output

Memory Protocol (MANDATORY)

Before work:

cat .claude/context/memory/learnings.md

After work:

Add integration learnings to .claude/context/memory/learnings.md
Add integration risks to .claude/context/memory/issues.md
