nlm-index

📁 nghyane/opencode-plugin-notebooklm 📅 9 days ago

Site-wide rank: #53022

Install command

npx skills add https://github.com/nghyane/opencode-plugin-notebooklm --skill nlm-index

Installs by agent

opencode 1

codex 1

claude-code 1

Skill documentation

NotebookLM Index

A workflow for scraping documentation sites and repositories and uploading the results to NotebookLM for AI-powered research.

Use Cases

Index an entire documentation site (React, Next.js, etc.)
Index a GitHub repo (README, docs, source files)
Bulk-upload YouTube video transcripts

Workflow

1. Identify Target

User provides:
- Docs URL: "https://react.dev/reference/react"
- GitHub repo: "vercel/ai" or "https://github.com/vercel/ai"
- YouTube playlist/channel
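The three input shapes above can be told apart with a small amount of parsing. The sketch below is illustrative only: `classify_target` is a hypothetical helper, not part of the skill, and it assumes a bare `owner/repo` string always means GitHub.

```python
import re
from urllib.parse import urlparse

YOUTUBE_HOSTS = {"youtube.com", "m.youtube.com", "youtu.be"}


def classify_target(target: str) -> tuple[str, str]:
    """Return (kind, normalized) where kind is 'github', 'youtube', or 'docs'.

    Hypothetical helper: the skill itself does not define this function.
    """
    target = target.strip()
    # Bare "owner/repo" shorthand such as "vercel/ai" (assumed to mean GitHub).
    if re.fullmatch(r"[\w.-]+/[\w.-]+", target):
        return "github", target
    parsed = urlparse(target)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "github.com":
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) >= 2:
            # Normalize full repo URLs to the "owner/repo" form.
            return "github", f"{parts[0]}/{parts[1]}"
    if host in YOUTUBE_HOSTS:
        return "youtube", target
    # Anything else is treated as a documentation URL.
    return "docs", target
```

The normalized `owner/repo` form is what the `gh api` call in step 3 expects.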

2. Create or Select Notebook

notebook_create({ title: "React Docs" })
# or
notebook_list()  # select existing

3. Discover URLs

Option A: Documentation Site

# Use webfetch to get sitemap or crawl links
webfetch({ url: "https://react.dev/sitemap.xml", format: "text" })

# Or scrape navigation links from docs page
webfetch({ url: "https://react.dev/reference/react", format: "markdown" })
# Extract all internal links from the page
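Once `webfetch` has returned the sitemap or the page, the URLs still have to be extracted from the text. A minimal sketch using only the Python standard library follows. The function names are hypothetical, and it assumes the sitemap is a plain `<urlset>` (not a sitemap index) and that the page arrives as Markdown with `[text](href)` links.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urlparse


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Collect every <loc> value from a sitemap, whatever its XML namespace."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Tags look like "{namespace}loc"; compare only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


def internal_links(markdown: str, base_url: str) -> list[str]:
    """Return unique same-host links from Markdown link syntax, in page order."""
    host = urlparse(base_url).netloc
    seen: dict[str, None] = {}
    for href in re.findall(r"\]\(([^)\s]+)", markdown):
        # Resolve relative links and drop #fragments so anchors do not duplicate pages.
        absolute = urljoin(base_url, href).split("#", 1)[0]
        if urlparse(absolute).netloc == host:
            seen.setdefault(absolute, None)
    return list(seen)
```

Dropping fragments keeps `useState` and `useState#usage` from being counted as two sources.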

Option B: GitHub Repo

# Use gh CLI to list files (quote URL to prevent shell glob expansion)
gh api 'repos/vercel/ai/git/trees/main?recursive=1' --jq '.tree[].path'

# Filter for docs/README
# Common patterns: README.md, docs/**, *.md, src/**/*.ts
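The path list printed by the `gh api` call can be narrowed with simple wildcard matching. This is a sketch under assumptions: `select_paths` and the `INCLUDE` tuple are hypothetical, and the patterns are adapted from the list above for `fnmatch`, whose `*` also matches `/` (so `docs/*` plays the role of `docs/**`).

```python
from fnmatch import fnmatchcase

# Adapted from the common patterns above. README.md is already covered by "*.md".
INCLUDE = ("docs/*", "*.md", "src/*.ts")


def select_paths(paths: list[str], patterns: tuple[str, ...] = INCLUDE) -> list[str]:
    """Keep repo paths that match at least one include pattern (case-sensitive)."""
    return [
        path
        for path in paths
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]
```

Here `paths` would be the output of the `gh api ... --jq '.tree[].path'` command split into lines.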

Option C: YouTube

# Collect video URLs from playlist or channel
# Each video URL can be added directly

4. Filter & Prioritize

Keep:

Documentation pages (guides, API refs, tutorials)
README files
Source code with good comments
YouTube videos with transcripts

Skip:

Asset files (.png, .css, .js bundles)
Generated/minified code
node_modules, dist, build
Paid/private content
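The skip rules for assets, bundles, and build directories can be expressed as a small predicate. The sketch below is an assumption about how one might encode them: `should_skip` is a hypothetical name, and the `.min.js` / `.bundle.js` suffixes are a guess at what counts as a "JS bundle". Paid or private content cannot be detected from the path alone and is not handled here.

```python
from posixpath import splitext
from urllib.parse import urlparse

SKIP_EXTENSIONS = {".png", ".css"}  # from the list above; extend as needed
SKIP_DIRS = {"node_modules", "dist", "build"}
BUNDLE_SUFFIXES = (".min.js", ".bundle.js")  # assumed naming for bundled/minified JS


def should_skip(url_or_path: str) -> bool:
    """Return True for assets, bundled JS, and files under build/vendor directories."""
    path = urlparse(url_or_path).path.lower()
    segments = [segment for segment in path.split("/") if segment]
    # Directory rule: meant mainly for repo paths; checks every parent directory.
    if any(segment in SKIP_DIRS for segment in segments[:-1]):
        return True
    name = segments[-1] if segments else ""
    if name.endswith(BUNDLE_SUFFIXES):
        return True
    return splitext(name)[1] in SKIP_EXTENSIONS
```

Because the directory rule looks at every parent segment, a docs URL that happens to contain `/build/` would also be skipped; restrict that rule to repo paths if this matters.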

Limits:

Max 50 sources per notebook (NotebookLM limit)
If >50, split into multiple notebooks: “React Docs (Part 1)”, “(Part 2)”
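Splitting an oversized URL list into numbered notebooks is simple slicing. A minimal sketch, assuming the 50-source limit stated above; `plan_notebooks` is a hypothetical helper that only plans titles and groups and does not call any NotebookLM tool.

```python
MAX_SOURCES = 50  # per-notebook limit stated above


def plan_notebooks(
    title: str, urls: list[str], limit: int = MAX_SOURCES
) -> list[tuple[str, list[str]]]:
    """Return (notebook_title, urls) pairs, adding "(Part N)" only when a split is needed."""
    groups = [urls[i:i + limit] for i in range(0, len(urls), limit)]
    if len(groups) <= 1:
        return [(title, urls)]
    return [
        (f"{title} (Part {number})", group)
        for number, group in enumerate(groups, start=1)
    ]
```

Each pair can then drive one `notebook_create` call followed by the batch upload in the next step.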

5. Batch Upload

# Collect URLs (space or newline separated)
source_add({
  urls: """
    https://react.dev/reference/react/useState
    https://react.dev/reference/react/useEffect
    https://react.dev/reference/react/useContext
    https://react.dev/learn/thinking-in-react
  """,
  notebook_id: "..."
})

Rate Limiting:

NotebookLM processes URLs asynchronously
For large batches (20+ URLs), split into chunks of 10-15
Wait a few seconds between batches
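The batching advice above might look like the following sketch. The `upload` callback is a stand-in for whatever actually invokes `source_add`. It is an assumption, as are the default batch size of 10 and the 3-second pause (the text only says "a few seconds").

```python
import time
from typing import Callable, Iterator


def chunked(urls: list[str], size: int = 10) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def upload_in_batches(
    urls: list[str],
    upload: Callable[[str], None],  # hypothetical wrapper around source_add
    size: int = 10,
    pause_seconds: float = 3.0,
) -> int:
    """Send URLs in newline-separated batches, pausing between them; return the batch count."""
    batches = list(chunked(urls, size))
    for index, batch in enumerate(batches):
        upload("\n".join(batch))
        # No need to wait after the final batch.
        if index < len(batches) - 1:
            time.sleep(pause_seconds)
    return len(batches)
```

Passing the uploader as a callback keeps the pacing logic testable without touching NotebookLM.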

6. Verify & Report

notebook_get({ notebook_id: "...", include_summary: true })

Report:

Total sources added
Any failed URLs (paid content, 404s, etc.)
Suggest next steps (query, generate audio, etc.)
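The report can be built by comparing what was requested with what the notebook now contains. The shape of the `notebook_get` response is not documented here, so this sketch assumes the caller has already extracted a plain list of source URLs from it. `build_report` and `format_report` are hypothetical helpers.

```python
def build_report(requested: list[str], added: list[str]) -> dict:
    """Compare requested URLs with the URLs present in the notebook."""
    added_set = set(added)
    failed = [url for url in requested if url not in added_set]
    return {
        "requested": len(requested),
        "added": len(requested) - len(failed),
        "failed": failed,
    }


def format_report(report: dict) -> str:
    """Render the report as short plain-text lines for the user."""
    lines = [f"Sources added: {report['added']} of {report['requested']}"]
    lines += [f"Failed: {url}" for url in report["failed"]]
    return "\n".join(lines)
```

Exact string comparison is the simplest check; if NotebookLM stores redirected or normalized URLs, the comparison would need to be loosened.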

Examples

Index React Hooks Docs

1. notebook_create({ title: "React Hooks Reference" })

2. Scrape https://react.dev/reference/react/hooks
   Extract: useState, useEffect, useContext, useReducer, etc.

3. source_add({
     urls: "https://react.dev/reference/react/useState https://react.dev/reference/react/useEffect ..."
   })

4. notebook_query({ query: "Summarize all hooks and their use cases" })

Index GitHub Repo

1. notebook_create({ title: "Vercel AI SDK" })

2. gh api 'repos/vercel/ai/git/trees/main?recursive=1'
   Filter: README.md, docs/**, packages/**/README.md

3. For each doc file:
   - If URL accessible: source_add({ urls: "https://github.com/vercel/ai/blob/main/README.md" })
   - If raw content needed: webfetch + source_add({ text: content, title: filename })

4. notebook_query({ query: "How do I use the AI SDK with Next.js?" })

Index YouTube Playlist

1. notebook_create({ title: "React Conf 2024" })

2. Collect video URLs from playlist

3. source_add({
     urls: """
       https://youtube.com/watch?v=xxx
       https://youtube.com/watch?v=yyy
       https://youtube.com/watch?v=zzz
     """
   })

4. studio_create({ type: "audio", focus_prompt: "Key announcements" })

Tips

Sitemap first: Most doc sites have /sitemap.xml – parse it for all URLs
GitHub raw URLs: Use raw.githubusercontent.com for direct file content
YouTube limits: Only public videos with captions work
Chunking: The per-notebook cap is about 50 sources, so larger sets must be split; for 100+ URLs, create multiple notebooks by topic
Verification: Always check notebook_get after bulk upload to confirm sources added
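For the raw-URL tip, the mapping from a repo path or a `blob` page URL to `raw.githubusercontent.com` is mechanical. A small sketch with hypothetical helper names follows. It assumes the ref (branch or tag) contains no slashes when converting a `blob` URL.

```python
from urllib.parse import quote


def raw_url(repo: str, ref: str, path: str) -> str:
    """Build a raw-content URL from "owner/repo", a branch or tag, and a file path."""
    return f"https://raw.githubusercontent.com/{repo}/{ref}/{quote(path)}"


def blob_to_raw(url: str) -> str:
    """Convert https://github.com/<owner>/<repo>/blob/<ref>/<path> to its raw URL."""
    prefix = "https://github.com/"
    if not url.startswith(prefix):
        raise ValueError(f"not a github.com URL: {url}")
    parts = url[len(prefix):].split("/", 3)
    if len(parts) != 4 or parts[2] != "blob":
        raise ValueError(f"not a blob URL: {url}")
    owner, repo, _, ref_and_path = parts
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref_and_path}"
```

The raw URL returns the file itself rather than GitHub's HTML page, which suits the `webfetch` plus `source_add({ text: ... })` route in the GitHub example.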

Constraints

Constraint	Limit
Sources per notebook	~50
URL types	Public websites, YouTube
Content	Visible text only (no JS-rendered content)
YouTube	Public videos with transcripts
