ebook-analysis

📁 jwynia/agent-skills 📅 Jan 20, 2026

总安装量

周安装量

#5144

全站排名

安装命令

npx skills add https://github.com/jwynia/agent-skills --skill ebook-analysis

Agent 安装分布

claude-code 34

opencode 30

gemini-cli 26

codex 25

antigravity 23

cursor 22

Skill 文档

Ebook Analysis: Non-Fiction Knowledge Extraction

You analyze ebooks to extract knowledge with full citation traceability. This skill supports two complementary extraction modes:

Concept Extraction – Extract ideas classified by abstraction (principle â tactic)
Entity Extraction – Extract named things (studies, researchers, frameworks, anecdotes) that persist across books

Core Principle

Every extraction must be traceable to its exact source. Citation traceability is non-negotiable. Extract less with full provenance rather than more without it.

Two Extraction Modes

Mode 1: Concept Extraction

For extracting IDEAS organized by abstraction level.

Use when: Analyzing a book for transferable ideas, building a concept taxonomy, understanding how abstract principles relate to concrete tactics.

Output: JSON files (analysis.json, concepts.json)

Example: “Spaced repetition improves retention” is a MECHANISM at Layer 2.

Mode 2: Entity Extraction

For extracting NAMED THINGS that can be cross-referenced across books.

Use when: Building a knowledge base where the same study, researcher, or framework appears in multiple books. The goal is entity resolutionârecognizing that “Hogarth’s framework” in Range is the same as “kind/wicked environments” mentioned elsewhere.

Output: Markdown files in knowledge base structure

Example: “Kind vs Wicked Environments” is a FRAMEWORK by Robin Hogarth.

Choosing a Mode

If you want to…	Use Mode
Understand a book’s argument structure	Concept Extraction
Build a reference library across books	Entity Extraction
Create actionable takeaways	Concept Extraction
Track what researchers say across sources	Entity Extraction
Both	Run both modes sequentially

Entity Extraction Mode (Detailed)

Entity Types

Type	What It Captures	Example
study	Research findings, experiments, data	Flynn Effect, Marshmallow Test
researcher	People and their contributions	Anders Ericsson, Robin Hogarth
framework	Mental models, taxonomies, systems	Kind vs Wicked, Desirable Difficulties
anecdote	Stories used to illustrate points	Tiger vs Roger, Challenger Disaster
concept	Ideas that aren’t frameworks	Cognitive entrenchment, Match quality

Extended Entity Type Guidance

Some entities don’t fit cleanly into the five types. Guidelines:

Entity Kind	Use Type	Rationale
Simulations/Games (Superstruct, EVOKE)	anecdote	Illustrative events, even if hypothetical
Institutions (IFTF, WEF)	researcher	Organizations contribute ideas like individuals
Historical events (Challenger disaster)	anecdote	Stories that illustrate principles
Hypothetical scenarios	anecdote	Future scenarios from books like Imaginable
Thought experiments	framework	If systematic; otherwise concept

When uncertain: Default to anecdote for narratives/events, concept for ideas, framework for systematic methods.

Author-as-Subject Pattern

When the book’s author is also a significant entity (e.g., Jane McGonigal in Imaginable):

Create a researcher entity if:

Author has notable prior work or institutional affiliation
Author appears in Wikipedia or other reference sources
Author’s background/credentials are relevant to understanding the book
Other books in your collection might reference them

Skip if:

Author is primarily known only for this book
No external sources to verify/enrich the entity

Template addition for author-subjects:

## Note
This researcher is the author of [Book] in our collection. Their frameworks and concepts are documented separately.

Entity File Template

# [Entity Name]
**Type:** study | researcher | framework | anecdote | concept
**Status:** stub | partial | solid | authoritative
**Last Updated:** YYYY-MM-DD
**Aliases:** alias1, alias2, alias3

## Summary
[2-3 sentence synthesized understanding]

## Key Findings / What It Illustrates
1. [Claim or finding with source]
   â Source: [Book], Ch.[X]

2. [Another claim]
   â Source: [Book], Ch.[X]

## Key Quotes
> "Quotable text here."

> "Another memorable quote."

## Sources in Collection
| Book | Author | How It's Used | Citation |
|------|--------|---------------|----------|
| Range | Epstein | [Role in book] | Ch.X |

## Sources NOT in Collection
- [Book that would enrich this entity]

## Related Entities
- [Other Entity](../type/other-entity.md) - Relationship description

## Open Questions
- [What we don't yet know]

Knowledge Base Structure

/knowledge/
âââ _index.md                    # Master registry
âââ _entities.json               # Searchable index (generated)
â
âââ nonfiction/
â   âââ _index.md                # Domain index
â   âââ _[book]-quotes.md        # Book-specific quotes file
â   âââ studies/
â   â   âââ flynn-effect.md
â   â   âââ chase-simon-chunking.md
â   âââ researchers/
â   â   âââ hogarth-robin.md
â   â   âââ tetlock-philip.md
â   âââ frameworks/
â   â   âââ kind-vs-wicked-environments.md
â   â   âââ desirable-difficulties.md
â   âââ anecdotes/
â   â   âââ tiger-vs-roger.md
â   â   âââ challenger-disaster.md
â   âââ concepts/
â       âââ cognitive-entrenchment.md
â       âââ match-quality.md
â
âââ cooking/                     # Domain-specific structure
â   âââ techniques/
â   âââ ingredients/
â   âââ equipment/
â
âââ technical/
    âââ patterns/
    âââ technologies/

Quotes Extraction

Quotable quotes are a distinct extraction type. For each book, create a quotes file:

File: _[book-slug]-quotes.md

Structure:

# Quotable Quotes from [Book Title]
**Author:** [Author]
**Last Updated:** YYYY-MM-DD

## On [Theme 1]
> "Quote text here."

> "Another quote on same theme."

## On [Theme 2]
> "Quote on different theme."

What makes a good quote:

Memorable phrasing that captures a key insight
Self-contained (understandable without context)
Surprising or counterintuitive formulation
Useful for presentations, writing, or reference

Entity Extraction Workflow

Scan book – Read through identifying named studies, researchers, frameworks, illustrative stories
Check existing entities – Use kb-resolve-entity.ts to see if entity already exists
Create or update – New entity â create file; existing â add as source
Add quotes – Extract memorable quotes to quotes file
Cross-link – Add Related Entities sections
Regenerate index – Run kb-generate-index.ts

Entity Extraction States (KB0-KB5)

State	Symptoms	Intervention
KB0	No knowledge base	Create directory structure
KB1	Structure exists, no entities	Begin extraction
KB2	Extracting from book	Create entity files
KB3	Entities created, not linked	Add Related Entities
KB4	Linked, no index	Run kb-generate-index.ts
KB5	Complete for this book	Proceed to next book

Cross-Book Synthesis Workflow

Triggered when: 2+ books have been extracted to the knowledge base.

Goals:

Find entities that appear in multiple books
Identify conceptual connections between books
Surface contradictions or complementary perspectives
Update entity files with multi-source synthesis

Process:

Entity overlap detection

# Find entities with 2+ sources
grep -l "Sources in Collection" knowledge/nonfiction/**/*.md | \
  xargs grep -l "| .* | .* |" | head -20

Or manually review entities updated with new source.

Conceptual connection mapping
- Compare frameworks across books (e.g., Range’s “wicked environments” â Imaginable’s “futures thinking”)
- Identify shared researchers (e.g., Tetlock appears in both Range and Imaginable)
- Look for complementary themes (prediction failure â preparation despite uncertainty)

Synthesis documentation For entities appearing in 2+ books, update the Summary section:

## Summary
[Synthesized understanding from BOTH sources, noting agreements and differences]

Cross-book insights Document thematic connections in context/insights/cross-book-{theme}.md:

# Cross-Book Insight: [Theme]

## Books Contributing
- Range (Epstein) - [perspective]
- Imaginable (McGonigal) - [perspective]

## Synthesis
[How the books complement or contradict each other]

Concept Extraction Mode (Detailed)

Concept Types (Abstract â Concrete)

Type	Definition	Example
Principle	Foundational truth or axiom	“Communities form around shared identity”
Mechanism	How something works	“Reciprocity creates social bonds”
Pattern	Recurring structure or approach	“The community lifecycle pattern”
Strategy	High-level approach to achieve goals	“Build trust before asking for contribution”
Tactic	Specific actionable technique	“Send welcome emails within 24 hours”

Abstraction Layers

Layer	Name	Abstraction	Example
0	Foundational	Universal principles	“Humans seek belonging”
1	Theoretical	Domain-specific theory	“Community requires shared purpose”
2	Strategic	Approaches and frameworks	“The funnel model of engagement”
3	Tactical	Specific methods	“Onboarding sequences”
4	Specific	Concrete implementations	“Use Discourse for forums”

Relationship Types

Relationship	Meaning	When to Use
INFLUENCES	A affects B	Causal or correlational connection
SUPPORTS	A provides evidence for B	Citation, example, validation
CONTRADICTS	A conflicts with B	Opposing claims
COMPOSED_OF	A contains B	Part-whole relationships
DERIVES_FROM	A is derived from B	Logical conclusions

Concept Extraction States (EA0-EA7)

State	Symptoms	Intervention
EA0	No input file	Guide file preparation
EA1	Raw file, not parsed	Run ea-parse.ts
EA2	Parsed, not extracted	LLM extracts concepts
EA3	Extracted, not classified	Assign types and layers
EA4	Classified, not annotated	Add themes, relationships
EA5	Single book complete	Export or proceed to synthesis
EA6	Multi-book ready	Cross-book synthesis
EA7	Analysis complete	Generate reports

Concept Extraction Workflow

Parse – Run ea-parse.ts to chunk book with position tracking
Extract – Present chunks to LLM for concept identification with exact quotes
Classify – Assign type (principleâtactic) and layer (0-4)
Annotate – Add themes and functional analysis
Link – Connect related concepts
Export – Generate analysis.json, concepts.json, report.md

Available Tools

Parsing Tools

ea-parse.ts

Parse ebook files into chunks with metadata and position tracking.

deno run --allow-read scripts/ea-parse.ts path/to/book.txt
deno run --allow-read scripts/ea-parse.ts path/to/book.epub --format epub
deno run --allow-read scripts/ea-parse.ts book.txt --chunk-size 1500 --overlap 150

Output: JSON with metadata, chapters (if detected), and chunks with positions.

Knowledge Base Tools

kb-generate-index.ts

Scan knowledge base and generate searchable entity index.

deno run --allow-read --allow-write scripts/kb-generate-index.ts /path/to/knowledge

Output: Creates _entities.json with all entities, aliases, and metadata.

kb-resolve-entity.ts

Search for existing entities before creating duplicates.

deno run --allow-read scripts/kb-resolve-entity.ts "Flynn Effect"
deno run --allow-read scripts/kb-resolve-entity.ts "Hogarth" --threshold 0.5
deno run --allow-read scripts/kb-resolve-entity.ts "kind learning" --json

Options:

--threshold <0-1> – Minimum match score (default: 0.3)
--limit <n> – Maximum results (default: 5)
--json – Output as JSON

Validation Tools

ea-validate.ts

Validate analysis output for citation accuracy and schema completeness.

deno run --allow-read scripts/ea-validate.ts analysis.json --report

Anti-Patterns

The Extraction Flood

Pattern: Extracting every potentially interesting phrase. Fix: Ask “Would I cite this?” before extracting. Quality over quantity.

The Citation Black Hole

Pattern: Extracting without preserving exact quotes or positions. Fix: Always capture: exact quote, chapter reference, context.

The Duplicate Entity

Pattern: Creating new entity without checking if it exists. Fix: Always run kb-resolve-entity.ts first.

The Orphan Entity

Pattern: Entities without Related Entities links. Fix: Every entity should connect to at least 2 others.

The Quote-Free Entity

Pattern: Entity captures ideas but no memorable phrasing. Fix: Include Key Quotes section with author’s exact words.

The Single-Book Silo

Pattern: Analyzing books without cross-referencing. Fix: After 2+ books, run synthesis to find connections.

Example Workflows

Full Entity Extraction (Range Example)

1. Scan book chapter by chapter
2. Identify all named studies, researchers, frameworks, anecdotes
3. Create inventory document listing all potential entities
4. For each entity:
   a. kb-resolve-entity.ts "[entity name]" to check existence
   b. Create markdown file in appropriate type directory
   c. Fill in template with findings and citations
   d. Add Key Quotes section
5. Create _range-quotes.md with all memorable quotes
6. Update _index.md with new entities
7. kb-generate-index.ts to rebuild _entities.json

Quick Concept Scan

1. ea-parse.ts book.txt --chunk-size 2000
2. For each chunk, extract top 3-5 concepts
3. Classify by type and layer
4. Generate concepts.json and report.md

Output Persistence

Entity Extraction Output

File	Location
Entity files	`knowledge/{domain}/{type}/{entity-slug}.md`
Quotes file	`knowledge/{domain}/_[book]-quotes.md`
Entity index	`knowledge/_entities.json`
Domain index	`knowledge/{domain}/_index.md`

Concept Extraction Output

File	Location
Full analysis	`ebook-analysis/{author}-{title}/analysis.json`
Concepts only	`ebook-analysis/{author}-{title}/concepts.json`
Citations	`ebook-analysis/{author}-{title}/citations.json`
Report	`ebook-analysis/{author}-{title}/report.md`

Verification (Oracle)

What This Skill Can Verify

Citation positions exist – Validate quoted text appears at claimed position
Schema completeness – Required fields present
Cross-reference integrity – Referenced entities exist
Duplicate detection – Entity doesn’t already exist (via kb-resolve-entity.ts)

What Requires Human Judgment

Significance – Is this worth extracting?
Classification – Is this really a “framework” vs “concept”?
Relationship validity – Does A really influence B?
Quote quality – Is this actually memorable?

Integration Graph

Inbound (From Other Skills)

Source	Leads to
research	Multi-book synthesis ready
reverse-outliner	Structural data for concept extraction

Outbound (To Other Skills)

From State	Leads to
Entity extraction complete	dna-extraction (deep functional analysis)
Concept extraction complete	media-meta-analysis (cross-source synthesis)

Complementary Skills

Skill	Relationship
dna-extraction	6-axis functional analysis for annotation
reverse-outliner	Structural approach for fiction
voice-analysis	Author style fingerprinting
context-network	Knowledge base maintenance

Calibration Data (from Range + Imaginable extractions)

By Book Density

Book Type	Expected Entities	Estimated Effort
Dense non-fiction (Range, Thinking Fast & Slow)	60-100	4-6 hours
Moderate non-fiction (most business books)	30-50	2-3 hours
Light non-fiction (popular science)	15-30	1-2 hours
Technical books	20-40	2-3 hours

By Book Subtype

Different non-fiction subtypes yield different entity profiles:

Subtype	Example	Entity Profile	Expected Count
Research synthesis	Range	Many studies, researchers, frameworks	60-100
Methodological/How-to	Imaginable	Many frameworks, few studies	30-50
Memoir/Narrative	Educated	Few frameworks, many anecdotes	20-40
Reference	Technical manuals	Many concepts, few anecdotes	Variable

Research synthesis books cite many studies and researchers, connecting ideas across domains. Methodological books teach techniques and frameworks but cite fewer external sources. Memoir/narrative books use personal stories to illustrate points rather than research.

Metadata Reliability Warning

Book classification metadata (Calibre tags, library categories) is often:

Wrong – Fiction/non-fiction misclassified
Generic – “General Fiction” or “Self-Help” applied broadly
Inconsistent – Same book categorized differently across sources

Always verify classification makes sense before extraction. A “fiction” tag on a methodology book like Imaginable is a metadata error.

Reasoning Requirements

Standard Reasoning

Single chunk concept extraction
Type/layer classification
Simple relationship identification
Individual entity creation

Extended Reasoning (ultrathink)

Use extended thinking for:

Multi-book synthesis – requires holding multiple networks simultaneously
Contradiction detection – semantic comparison across sources
Theme emergence – identifying patterns across large sets
Knowledge gap identification – reasoning about what’s missing

Trigger phrases: “synthesize across books”, “find contradictions”, “identify gaps”, “comprehensive analysis”

What You Do NOT Do

Extract without citation traceability
Create entities without checking for duplicates
Skip the linking phase (orphan entities are not useful)
Leave entities without quotes
Treat fiction as non-fiction
Use regex for semantic analysis (LLM judgment only)

GitHub 仓库 ↗ ← 返回陌讯 Skills 聚合平台