playwright-local

📁 olino3/forge 📅 13 days ago

Install command

npx skills add https://github.com/olino3/forge --skill playwright-local

Skill documentation

skill:playwright-local – Browser Automation & Web Scraping with Playwright

Version: 1.0.0

Purpose

Build browser automation and web scraping scripts using Playwright on the local machine. Supports all three browser engines (Chromium, Firefox, WebKit) in headless or headed mode. Covers page navigation, element interaction, form filling, screenshot capture, PDF generation, network interception, authentication flows, file upload/download, iframe handling, and multi-page workflows. Use when you need to automate browser tasks, scrape dynamic websites, capture visual snapshots, or build end-to-end test flows locally.

File Structure

skills/playwright-local/
├── SKILL.md (this file)
└── examples.md

Interface References

Context: Loaded via ContextProvider Interface
Memory: Accessed via MemoryStore Interface
Shared Patterns: Shared Loading Patterns
Schemas: Validated against context_metadata.schema.json and memory_entry.schema.json

Mandatory Workflow

IMPORTANT: Execute ALL steps in order. Do not skip any step.

Step 1: Initial Analysis

Determine automation goal: scraping, testing, form filling, screenshots/PDF, or monitoring
Identify target site(s) and pages to automate
Determine browser preference: Chromium (default), Firefox, or WebKit
Decide headless vs headed mode (headed for debugging, headless for CI/production)
Assess complexity: single page, multi-step flow, or multi-tab workflow
Check if authentication is required (cookies, localStorage, login forms)
Identify data extraction needs: text content, structured data, images, files

Step 2: Load Memory

Follow Standard Memory Loading with skill="playwright-local" and domain="engineering".

Step 3: Load Context

Follow Standard Context Loading for the engineering domain. Stay within the file budget declared in frontmatter.

Step 4: Configure Playwright Setup

Generate project setup commands:

npm init -y
npm install playwright
npx playwright install chromium  # or firefox, webkit, or all

Configure TypeScript or JavaScript project as appropriate
Set browser launch options:
- headless: true/false
- slowMo: milliseconds between actions (for debugging)
- args: browser arguments (e.g., --disable-gpu, --no-sandbox)
Set browser context options:
- viewport: width and height (default: 1280×720)
- locale: language/region (e.g., en-US)
- permissions: geolocation, notifications, camera, microphone
- userAgent: custom user agent string
- storageState: path to saved auth state
Configure timeouts: navigation (30s default), action (e.g. 5s; the library default is also 30s), assertion (5s, the expect() default in Playwright Test)
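
The launch and context options listed above could be collected in one place, for example as follows. This is a minimal sketch under stated assumptions: `LaunchConfig` and `ContextConfig` are local stand-in types that mirror only the option names from this step (they are not Playwright's own types), and `buildConfig`, its `debug` flag, and the 250 ms `slowMo` value are hypothetical.

```typescript
// Local stand-in types: they mirror only the option names listed in Step 4.
// In a real script these objects would be passed to chromium.launch() and
// browser.newContext() from the "playwright" package.
interface LaunchConfig {
  headless: boolean;
  slowMo: number; // milliseconds between actions
  args: string[];
}

interface ContextConfig {
  viewport: { width: number; height: number };
  locale: string;
  userAgent?: string;
  storageState?: string; // path to saved auth state
}

// Hypothetical helper: headed and slowed down for debugging, headless otherwise.
function buildConfig(
  debug: boolean,
  authFile?: string,
): { launch: LaunchConfig; context: ContextConfig } {
  const context: ContextConfig = {
    viewport: { width: 1280, height: 720 },
    locale: "en-US",
  };
  if (authFile !== undefined) {
    context.storageState = authFile;
  }
  return {
    launch: { headless: !debug, slowMo: debug ? 250 : 0, args: ["--disable-gpu"] },
    context,
  };
}
```

With Playwright installed, the result would typically be used as `const browser = await chromium.launch(cfg.launch)` followed by `const context = await browser.newContext(cfg.context)`; timeouts can then be set with `context.setDefaultNavigationTimeout()` and `context.setDefaultTimeout()`.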

Step 5: Build Automation Script

Page navigation: page.goto() with wait strategies (load, domcontentloaded, networkidle)
Wait strategies: page.waitForSelector(), page.waitForLoadState(), page.waitForURL(), page.waitForResponse()
Element selectors (in priority order):
- Role selectors: page.getByRole('button', { name: 'Submit' })
- Text selectors: page.getByText('Welcome')
- Label selectors: page.getByLabel('Email')
- Placeholder selectors: page.getByPlaceholder('Enter email')
- Test ID selectors: page.getByTestId('submit-btn')
- CSS selectors: page.locator('.product-card')
- XPath selectors: page.locator('xpath=//div[@class="item"]')
Interaction sequences: click(), fill(), pressSequentially() (replaces the deprecated type()), selectOption(), check(), hover(), dragTo()
Data extraction: textContent(), innerText(), getAttribute(), inputValue(), $$eval()
Error handling: try/catch blocks, screenshot on failure, graceful cleanup
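
The navigation and data-extraction items above can be sketched roughly as below. To keep the snippet self-contained, `PageLike` and `LocatorLike` are minimal structural stand-ins for the parts of Playwright's `Page` and `Locator` that are used; a real `Page` object is expected to satisfy them. The helper name `scrapeTexts` is hypothetical.

```typescript
// Minimal structural stand-ins for the parts of Playwright's Page and Locator
// used here, so the sketch type-checks without the "playwright" package.
interface LocatorLike {
  allTextContents(): Promise<string[]>;
}

interface PageLike {
  goto(
    url: string,
    options?: { waitUntil?: "load" | "domcontentloaded" | "networkidle" },
  ): Promise<unknown>;
  locator(selector: string): LocatorLike;
}

// Hypothetical helper: navigate, then return the trimmed, non-empty text of
// every element matching the selector.
async function scrapeTexts(
  page: PageLike,
  url: string,
  selector: string,
): Promise<string[]> {
  await page.goto(url, { waitUntil: "domcontentloaded" });
  const raw = await page.locator(selector).allTextContents();
  return raw.map((text) => text.trim()).filter((text) => text.length > 0);
}
```

A call might look like `await scrapeTexts(page, url, '.product-card')`. Note that `allTextContents()` does not wait for matches to appear, so a dynamic page may need an event-driven wait before extraction.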

Step 6: Handle Advanced Patterns

Implement as needed based on Step 1 analysis:

Authentication flows:
- Login form automation (fill credentials, submit, wait for redirect)
- Save auth state with context.storageState({ path: 'auth.json' })
- Reuse auth state in subsequent runs to skip login
- Cookie injection via context.addCookies()
- localStorage/sessionStorage manipulation via page.evaluate()
Network interception:
- Mock API responses with page.route() to return custom data
- Block resource types (images, fonts, analytics) to speed up scraping
- Capture API responses with page.on('response') for data extraction
- Modify request headers (auth tokens, custom headers)
File handling:
- File upload: page.setInputFiles() or the page's 'filechooser' event
- File download: configure downloadsPath, wait for download event
- PDF generation: page.pdf({ path, format, printBackground })
Complex interactions:
- iframe handling: page.frameLocator() or frame() for cross-frame interaction
- Multi-tab workflows: context.on('page') to capture new tabs/popups
- Shadow DOM: page.locator() pierces open shadow roots by default (XPath selectors do not; closed shadow roots are not supported)
- Drag and drop: page.dragAndDrop(source, target)
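
As one illustration of the network interception items above, blocking resource types might be sketched like this. `RouteLike` and `RequestLike` are structural stand-ins for the parts of Playwright's `Route` and `Request` that are used, and `makeBlockHandler` is a hypothetical helper name.

```typescript
// Structural stand-ins for the parts of Playwright's Route and Request used here.
interface RequestLike {
  resourceType(): string;
}

interface RouteLike {
  request(): RequestLike;
  abort(): Promise<void>;
  continue(): Promise<void>;
}

// Returns a route handler that aborts requests whose resource type is blocked
// and lets everything else through unchanged.
function makeBlockHandler(
  blockedTypes: string[],
): (route: RouteLike) => Promise<void> {
  const blocked = new Set(blockedTypes);
  return async (route) => {
    if (blocked.has(route.request().resourceType())) {
      await route.abort();
    } else {
      await route.continue();
    }
  };
}
```

With a real page, the handler would be registered as `await page.route('**/*', makeBlockHandler(['image', 'font']))` before navigating.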

Step 7: Add Error Prevention

CRITICAL: Implement these 10 common error preventions to ensure robust automation.

1. Stale element handling: Do not cache ElementHandle objects. Use Playwright’s locators, which re-resolve the element and auto-wait on every action.
2. Navigation timeouts: Set explicit timeouts for page.goto() and for navigation waits such as page.waitForURL() (page.waitForNavigation() is deprecated). The 30s default can be too short for slow sites, while an overly long timeout delays failure detection. Use page.goto(url, { timeout: 60000 }) for known slow pages.
3. Anti-bot detection avoidance: Rotate user agents, set realistic viewport sizes, add human-like delays with page.waitForTimeout() between actions, and reduce headless detection by using chromium.launch({ channel: 'chrome' }) for a real browser fingerprint.
4. Proper wait strategies: Never use hard-coded waitForTimeout() as the primary wait. Use event-driven waits: waitForSelector(), waitForLoadState('networkidle'), waitForResponse(), or waitForURL(). Hard waits are fragile and slow.
5. Screenshot on failure: Wrap automation in try/catch and capture page.screenshot({ path: 'error.png', fullPage: true }) on failure. This provides visual debugging context that logs alone cannot.
6. Retry logic: Implement retry wrappers for flaky operations (network requests, element interactions on dynamic pages). Use exponential backoff that starts at 1s and doubles after each failure (1s, 2s, 4s, ...), with a maximum of 3 attempts.
7. Graceful cleanup: Always close the browser and context in a finally block. Leaked browser processes consume memory and can cause port conflicts. Call browser.close() even when the script fails.
8. Selector specificity: Avoid overly broad selectors like div or .item that match multiple elements. Prefer role-based, text-based, or test-id selectors. When using CSS, be specific enough to match exactly one element.
9. Race condition prevention: When clicking a button that triggers navigation, start the wait before the click, for example Promise.all([page.waitForNavigation(), page.click(selector)]), so the navigation listener is registered before the click fires. Since page.waitForNavigation() is deprecated, prefer page.waitForURL() in new scripts.
10. Resource cleanup for long-running scripts: For scripts that process many pages, close and reopen contexts periodically to prevent memory leaks. Monitor page.on('crash') and page.on('pageerror') to detect and recover from browser crashes.
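
The retry item above could be implemented along these lines. This is a sketch, not a prescribed implementation: `withRetry` and `sleep` are hypothetical helper names, and the defaults (3 attempts, 1s base delay) simply follow the numbers given in the list.

```typescript
// Resolves after the given number of milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: runs `operation` up to `maxAttempts` times, waiting
// baseDelayMs, 2 * baseDelayMs, 4 * baseDelayMs, ... between failed attempts.
// Rethrows the last error if every attempt fails.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // No delay after the final attempt; the error is rethrown below.
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}
```

A flaky navigation could then be wrapped as `await withRetry(() => page.goto(url, { timeout: 60000 }))`. With the defaults, three attempts are separated by waits of 1s and 2s.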

Step 8: Generate Output

Save output to /claudedocs/playwright-local_{project}_{YYYY-MM-DD}.md
Follow naming conventions in ../OUTPUT_CONVENTIONS.md
Output includes:
- Complete Playwright script (TypeScript or JavaScript)
- Setup instructions (install commands, browser downloads)
- Configuration files (tsconfig.json if TypeScript)
- Extracted data or screenshots (if applicable)
- Error handling and retry logic
- Run instructions (command to execute the script)
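
The output file name pattern above can be built with a small helper such as the one below. `outputPath` is a hypothetical name, and using the UTC date is an assumption; ../OUTPUT_CONVENTIONS.md may specify otherwise.

```typescript
// Hypothetical helper: builds the report path from the pattern
// /claudedocs/playwright-local_{project}_{YYYY-MM-DD}.md, using the UTC date
// (an assumption; the referenced conventions file may differ).
function outputPath(project: string, date: Date = new Date()): string {
  const day = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return `/claudedocs/playwright-local_${project}_${day}.md`;
}
```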

Step 9: Update Memory

Follow Standard Memory Update for skill="playwright-local". Store working selectors, authentication flows, and automation patterns for future sessions.

10 Common Errors to Prevent

1. Stale element handling: Re-query elements before interaction; use locators, not cached ElementHandles.
2. Navigation timeouts: Set appropriate timeouts; adjust for slow sites, don’t rely on defaults blindly.
3. Anti-bot detection avoidance: Use real browser channels, rotate user agents, add human-like delays.
4. Proper wait strategies: Use event-driven waits, not hard-coded waitForTimeout().
5. Screenshot on failure: Always capture screenshots in catch blocks for visual debugging.
6. Retry logic: Wrap flaky operations with exponential backoff retries (max 3 attempts).
7. Graceful cleanup: Close browser in finally block to prevent leaked processes.
8. Selector specificity: Use role/text/test-id selectors; avoid overly broad CSS selectors.
9. Race condition prevention: Use Promise.all() for click + navigation combinations.
10. Resource cleanup for long-running scripts: Periodically close/reopen contexts; monitor for crashes.
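
The screenshot-on-failure and graceful-cleanup items can be combined in one wrapper, sketched below. `ScreenshotPage` and `ClosableBrowser` are structural stand-ins for the parts of Playwright's `Page` and `Browser` that are used, and `runWithCleanup` is a hypothetical helper name.

```typescript
// Structural stand-ins for the parts of Playwright's Page and Browser used here.
interface ScreenshotPage {
  screenshot(options: { path: string; fullPage: boolean }): Promise<unknown>;
}

interface ClosableBrowser<P extends ScreenshotPage> {
  newPage(): Promise<P>;
  close(): Promise<void>;
}

// Hypothetical wrapper: runs `task` against a fresh page, saves a full-page
// screenshot if the task throws, and always closes the browser.
async function runWithCleanup<P extends ScreenshotPage, T>(
  browser: ClosableBrowser<P>,
  task: (page: P) => Promise<T>,
  screenshotPath = "error.png",
): Promise<T> {
  try {
    const page = await browser.newPage();
    try {
      return await task(page);
    } catch (error) {
      // Best effort: a failed screenshot must not hide the original error.
      await page
        .screenshot({ path: screenshotPath, fullPage: true })
        .catch(() => undefined);
      throw error;
    }
  } finally {
    // Runs on success and on failure, so no browser process is leaked.
    await browser.close();
  }
}
```

With Playwright installed, a call might look like `await runWithCleanup(await chromium.launch(), async (page) => { /* automation steps */ })`.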

Compliance Checklist

Before completing, verify:

All mandatory workflow steps executed in order
Standard Memory Loading pattern followed (Step 2)
Standard Context Loading pattern followed (Step 3)
Browser install commands included in setup
Wait strategies use event-driven waits, not hard-coded timeouts
Error handling includes screenshot on failure
Graceful cleanup with browser.close() in finally block
All 10 error preventions reviewed and applied where relevant
Output saved with standard naming convention
Standard Memory Update pattern followed (Step 9)

Version History

Version	Date	Changes
1.0.0	2025-07-15	Initial release: navigation, interaction, scraping, screenshots, PDF, network interception, auth flows, 10 error preventions
