article-extractor
Total installs: 4
Weekly installs: 3
Site rank: #51661
Install command
npx skills add https://github.com/jrajasekera/claude-skills --skill article-extractor
Installs by agent
- openclaw: 2
- opencode: 2
- cursor: 2
- claude-code: 2
Skill documentation
Article Extractor
Extract clean article content from URLs, removing ads, navigation, and clutter. If one extraction tool fails, the script falls back to another.
Workflow
When the user provides a URL to download or extract:
- Call the extraction script directly with the URL (do NOT fetch the URL first with web_fetch)
- The script handles fetching, extraction, and saving automatically
- It returns a clean Markdown file with frontmatter
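Once a file has been saved, it can be handy to separate the frontmatter from the article body. The following is only a minimal sketch: it assumes the frontmatter is a YAML-style block delimited by `---` lines at the very top of the file, which this document does not confirm, and the function names article_frontmatter and article_body are hypothetical helpers, not part of the skill.

```shell
# Assumption: the saved file starts with a frontmatter block delimited by
# lines containing only "---". Both helpers are illustrative, not part of
# the skill itself.

# Print only the frontmatter lines (without the delimiters).
article_frontmatter() {
  awk 'NR==1 && $0=="---" {infm=1; next}
       infm && $0=="---"  {exit}
       infm               {print}' "$1"
}

# Print everything except the leading frontmatter block.
article_body() {
  awk 'NR==1 && $0=="---" {infm=1; next}
       infm && $0=="---"  {infm=0; next}
       !infm              {print}' "$1"
}
```

A later `---` in the body (for example a Markdown horizontal rule) is left untouched, because only a block that opens on the first line is treated as frontmatter.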
Usage
# Basic extraction
scripts/extract-article.sh "https://example.com/article"
# Specify output location
scripts/extract-article.sh "https://example.com/article" -o my-article.md -d ~/Documents
# Try Wayback Machine if original fails
scripts/extract-article.sh "https://example.com/article" --wayback
Make the script executable if needed: chmod +x scripts/extract-article.sh
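To extract several articles in one go, the commands above can be wrapped in a loop. This is a rough sketch, not something the skill ships: extract_batch, the URL list file, and the EXTRACTOR override variable are all hypothetical, and only the -d option shown above is used.

```shell
# EXTRACTOR is a hypothetical override hook so the path can be changed;
# it defaults to the script path used in the examples above.
EXTRACTOR="${EXTRACTOR:-scripts/extract-article.sh}"

# extract_batch <url-list-file> <output-dir>
# Reads one URL per line, skipping blank lines and lines starting with "#".
# Returns non-zero if any extraction failed.
extract_batch() {
  local list="$1" outdir="$2" failed=0 url
  while IFS= read -r url || [ -n "$url" ]; do
    case "$url" in
      ''|'#'*) continue ;;
    esac
    # </dev/null keeps the extractor from consuming the URL list on stdin.
    "$EXTRACTOR" "$url" -d "$outdir" </dev/null || failed=$((failed + 1))
  done < "$list"
  echo "$failed URL(s) failed" >&2
  [ "$failed" -eq 0 ]
}
```

For example, `extract_batch urls.txt ~/Documents` would save each article into ~/Documents and report how many URLs could not be extracted.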
Key Options
- -o <file>: Output filename
- -d <dir>: Output directory
- -w, --wayback: Try Wayback Machine if extraction fails
- -t <tool>: Force a specific tool (jina, trafilatura, readability, or fallback)
- -q: Quiet mode
For complete options, exit codes, tool details, and examples, see references/tools-and-options.md.
Common Failures
- Exit 3 (access denied): Paywall or login required; try --wayback
- Exit 4 (no content): Heavy JavaScript; try a different --tool
- Exit 2 (network): Connection issue; check the URL
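The remedies above can be automated with a small wrapper that inspects the exit code and retries once. This is a sketch under stated assumptions: extract_with_retry and the EXTRACTOR override are hypothetical, the choice of readability as the retry tool is arbitrary, and the full exit-code list lives in references/tools-and-options.md rather than here.

```shell
# EXTRACTOR is a hypothetical override hook; it defaults to the script
# path used elsewhere in this document.
EXTRACTOR="${EXTRACTOR:-scripts/extract-article.sh}"

# extract_with_retry <url>
# Runs the extractor once, then applies the remedy suggested for the
# exit code it returned. Only exit codes 2, 3, and 4 are handled.
extract_with_retry() {
  local url="$1" status=0
  "$EXTRACTOR" "$url" || status=$?
  case "$status" in
    0) return 0 ;;
    2) echo "network error, check the URL: $url" >&2
       return 2 ;;
    3) echo "access denied, retrying with --wayback" >&2
       "$EXTRACTOR" "$url" --wayback ;;
    4) echo "no content, retrying with -t readability" >&2
       # Arbitrary pick for illustration; any other listed tool would do.
       "$EXTRACTOR" "$url" -t readability ;;
    *) return "$status" ;;
  esac
}
```

The wrapper retries at most once, so a second failure is passed straight back to the caller with the extractor's own exit code.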
Local Tools (Optional)
For offline extraction, run scripts/install-deps.sh