spa-reverse-engineer
Install command
npx skills add https://github.com/pv-udpv/pplx-sdk --skill spa-reverse-engineer
SPA Reverse Engineering: React + Vite + Workbox + CDP
Reverse engineer modern SPAs to extract APIs, intercept service workers, debug runtime state, and build tooling.
When to use
Use this skill when:
- Analyzing perplexity.ai SPA internals (React component tree, state, hooks)
- Intercepting Workbox service worker caching and request strategies
- Using Chrome DevTools Protocol (CDP) to automate browser interactions
- Building Chrome extensions for traffic interception or state extraction
- Debugging Vite-bundled source maps and module graph
- Extracting GraphQL/REST schemas from SPA network layer
- Writing Puppeteer/Playwright scripts for automated API discovery
Instructions
Step 1: Identify SPA Stack
Detect the technology stack of the target SPA:
// In DevTools Console:
// React detection
window.__REACT_DEVTOOLS_GLOBAL_HOOK__ // React DevTools presence
document.querySelector('#__next') // Next.js
document.querySelector('#root') // Vite/CRA
document.querySelector('#app') // Vue (for comparison)
// Vite detection
document.querySelector('script[type="module"]') // ESM modules
// Check source for /@vite/client or /.vite/ paths
// Workbox / Service Worker
navigator.serviceWorker.getRegistrations() // List SWs
// Check Application → Service Workers in DevTools
// State management
window.__REDUX_DEVTOOLS_EXTENSION__ // Redux
// React DevTools → Components → hooks for Zustand/Jotai/Recoil
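The console checks above can also be approximated offline against a saved copy of the page HTML. The following is a minimal sketch in Python; `detect_stack` and the `STACK_MARKERS` table are hypothetical names introduced here, and the patterns are heuristics, so a match suggests a technology rather than proving it.

```python
import re

# Heuristic markers (assumptions for illustration): a hit hints at a technology
# but does not prove it.
STACK_MARKERS = {
    "nextjs": re.compile(r'id="__next"|id="__NEXT_DATA__"'),
    "vite_dev": re.compile(r"/@vite/client"),
    "esm_entry": re.compile(r'<script[^>]+type="module"'),
    "service_worker": re.compile(r"serviceWorker\.register\("),
}


def detect_stack(html: str) -> dict[str, bool]:
    """Report which stack markers appear in the raw HTML of a page."""
    return {name: bool(pattern.search(html)) for name, pattern in STACK_MARKERS.items()}
```

Run it on the output of `curl -s <url>` to get a first guess before opening DevTools; runtime-only signals such as `__REACT_DEVTOOLS_GLOBAL_HOOK__` still need the console.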
Step 2: React Internals Analysis
Component Tree Extraction
// Get React fiber tree from any DOM element
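function getFiber(element) {
  // React 17+ uses __reactFiber$ on rendered nodes and __reactContainer$ on the
  // root container; __reactInternalInstance$ is the React 16 equivalent
  const key = Object.keys(element).find(k =>
    k.startsWith('__reactFiber$') ||
    k.startsWith('__reactContainer$') ||
    k.startsWith('__reactInternalInstance$')
  );
  return key ? element[key] : null;
}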
// Walk fiber tree
function walkFiber(fiber, depth = 0) {
if (!fiber) return;
const name = fiber.type?.displayName || fiber.type?.name || fiber.type;
if (typeof name === 'string') {
console.log(' '.repeat(depth) + name);
}
walkFiber(fiber.child, depth + 1);
walkFiber(fiber.sibling, depth);
}
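// Start from the app root (#root for Vite/CRA, #__next for Next.js)
const root = document.getElementById('root') ?? document.getElementById('__next');
// If the container itself carries no fiber, fall back to its first rendered child
walkFiber(getFiber(root) ?? getFiber(root.firstElementChild));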
State & Props Extraction
// Extract component state via fiber
function getComponentState(fiber) {
const state = [];
let hook = fiber.memoizedState;
while (hook) {
state.push(hook.memoizedState);
hook = hook.next;
}
return state;
}
// Find specific component by name
function findComponent(fiber, name) {
if (!fiber) return null;
if (fiber.type?.name === name || fiber.type?.displayName === name) {
return fiber;
}
return findComponent(fiber.child, name) || findComponent(fiber.sibling, name);
}
Step 3: Vite Bundle Analysis
Source Map Extraction
# Find source maps from bundled assets
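curl -s https://www.perplexity.ai/ | grep -oP 'src="\K[^"]*\.js(?=")' | while read -r url; do
  # Resolve protocol-relative and root-relative paths; absolute (CDN) URLs are used as-is
  case "$url" in
    //*) url="https:${url}" ;;
    /*) url="https://www.perplexity.ai${url}" ;;
  esac
  echo "Checking: ${url}.map"
  curl -sI "${url}.map" | head -5
done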
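The same lookup can be expressed in Python, which makes URL resolution explicit. This is a sketch only: `candidate_map_urls` is a hypothetical helper, and it assumes the common convention that a bundle at `x.js` publishes its map at `x.js.map`. The authoritative pointer is the `//# sourceMappingURL=` comment at the end of each bundle, which this sketch does not read.

```python
import re
from urllib.parse import urljoin

# Matches the src attribute of <script> tags that point at .js files.
SCRIPT_SRC = re.compile(r'<script[^>]+src="([^"]+\.js)"')


def candidate_map_urls(html: str, base_url: str) -> list[str]:
    """Build the conventional `<script>.map` URL for each script tag in `html`.

    urljoin resolves relative paths against `base_url` and leaves absolute
    (for example CDN) URLs untouched.
    """
    return [urljoin(base_url, src) + ".map" for src in SCRIPT_SRC.findall(html)]
```

Each returned URL can then be probed with a HEAD request; a 200 response with a JSON content type is a good sign that the map is public.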
Module Graph
// In Vite dev mode (if accessible):
// /__vite_module_graph shows dependency graph
// In production, analyze chunks instead:
// Performance → Network → JS files → Initiator chain
// Sources → Webpack/Vite tree → module paths
Step 4: Service Worker & Workbox Interception
Analyze Caching Strategy
// List all cached URLs
async function listCaches() {
const names = await caches.keys();
for (const name of names) {
const cache = await caches.open(name);
const keys = await cache.keys();
console.log(`Cache: ${name} (${keys.length} entries)`);
keys.forEach(k => console.log(` ${k.url}`));
}
}
// Intercept SW fetch events (from SW scope)
self.addEventListener('fetch', event => {
console.log('[SW Intercept]', event.request.method, event.request.url);
});
Workbox Strategy Detection
// Common Workbox strategies to look for in SW source:
// - CacheFirst → Static assets (fonts, images)
// - NetworkFirst → API calls (dynamic data)
// - StaleWhileRevalidate → Frequently updated content
// - NetworkOnly → Always fresh (auth endpoints)
// - CacheOnly → Offline-only content
// Check SW source for workbox patterns:
// workbox.strategies.CacheFirst
// workbox.routing.registerRoute
// workbox.precaching.precacheAndRoute
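Scanning a downloaded service worker file for these strategy names is easy to automate. The sketch below is one possible approach, with `count_strategies` as a hypothetical helper name. It assumes the class names survive in the shipped file; a minifier may rename them, in which case an empty result says nothing about the real caching behavior.

```python
import re
from collections import Counter

# The five strategy class names listed above.
STRATEGIES = ("CacheFirst", "NetworkFirst", "StaleWhileRevalidate", "NetworkOnly", "CacheOnly")
STRATEGY_RE = re.compile(r"\b(" + "|".join(STRATEGIES) + r")\b")


def count_strategies(sw_source: str) -> Counter:
    """Count how often each Workbox strategy name appears in service worker source."""
    return Counter(STRATEGY_RE.findall(sw_source))
```

A dominant `NetworkFirst` count, for example, would hint that most routed requests are API calls rather than static assets.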
Step 5: Chrome DevTools Protocol (CDP)
Automated Interception via CDP
import asyncio
from playwright.async_api import async_playwright

async def intercept_with_cdp():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        context = await browser.new_context()
        page = await context.new_page()
        # Open a CDP session bound to this page
        cdp = await context.new_cdp_session(page)
        # Intercept network at CDP level; Network.enable also turns on WebSocket events
        await cdp.send('Network.enable')
        cdp.on('Network.requestWillBeSent', lambda params:
            print(f"[CDP] {params['request']['method']} {params['request']['url']}")
        )
        cdp.on('Network.responseReceived', lambda params:
            print(f"[CDP] {params['response']['status']} {params['response']['url']}")
        )
        # Intercept WebSocket frames (first 200 characters of each payload)
        cdp.on('Network.webSocketFrameSent', lambda params:
            print(f"[WS→] {params['response']['payloadData'][:200]}")
        )
        cdp.on('Network.webSocketFrameReceived', lambda params:
            print(f"[←WS] {params['response']['payloadData'][:200]}")
        )
        await page.goto('https://www.perplexity.ai/')
        await page.wait_for_timeout(60000)  # keep capturing for 60 seconds
        await browser.close()

asyncio.run(intercept_with_cdp())
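Printing every request gets noisy quickly, so it can help to collect the `Network.requestWillBeSent` payloads in a list and reduce them to an endpoint inventory afterwards. The helper below is a sketch under that assumption; `summarize_requests` is a hypothetical name, and the input is expected to be the `params` dicts passed to the event handler.

```python
from collections import Counter
from urllib.parse import urlsplit


def summarize_requests(events: list[dict]) -> Counter:
    """Count (method, path) pairs from Network.requestWillBeSent payloads.

    Query strings are dropped so that paginated or parameterized calls to the
    same endpoint are grouped together.
    """
    summary: Counter = Counter()
    for params in events:
        request = params.get("request", {})
        url = request.get("url")
        if not url:
            continue  # skip malformed or partial payloads
        summary[(request.get("method", "GET"), urlsplit(url).path)] += 1
    return summary
```

To use it with the capture script, replace the `print` lambda with `events.append` and call `summarize_requests(events).most_common()` once the capture window ends.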
Runtime JS Evaluation via CDP
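import json

# Execute JS in page context (reuses the `cdp` session from the previous example)
result = await cdp.send('Runtime.evaluate', {
    # Fall back to null so JSON.stringify always returns a string
    'expression': 'JSON.stringify(window.__NEXT_DATA__ ?? null)',
    'returnByValue': True,
})
next_data = json.loads(result['result']['value'])  # None when the page has no __NEXT_DATA__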
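`Runtime.evaluate` responses do not always carry a `value`: the expression may throw, in which case the response includes `exceptionDetails`, or it may evaluate to `undefined`. A small unwrapping helper keeps that handling in one place. This is a sketch; `unwrap_evaluate`, `parse_json_result`, and `EvaluateError` are hypothetical names, not part of Playwright or CDP.

```python
import json
from typing import Any


class EvaluateError(RuntimeError):
    """Raised when the evaluated expression threw inside the page."""


def unwrap_evaluate(response: dict) -> Any:
    """Return the plain value from a Runtime.evaluate response, or None if absent."""
    if "exceptionDetails" in response:
        raise EvaluateError(response["exceptionDetails"].get("text", "evaluation failed"))
    remote = response.get("result", {})
    if remote.get("type") == "undefined" or "value" not in remote:
        return None
    return remote["value"]


def parse_json_result(response: dict) -> Any:
    """Decode a result produced by an expression wrapped in JSON.stringify(...)."""
    value = unwrap_evaluate(response)
    return json.loads(value) if isinstance(value, str) else value
```

With this in place, `parse_json_result(await cdp.send('Runtime.evaluate', {...}))` returns `None` instead of raising `KeyError` on pages that do not define the global being read.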
Step 6: Chrome Extension Development
Manifest v3 Extension for Traffic Capture
{
"manifest_version": 3,
"name": "pplx-sdk Traffic Capture",
"version": "1.0",
"permissions": [
"webRequest", "activeTab", "storage", "debugger"
],
"host_permissions": ["https://www.perplexity.ai/*"],
"background": {
"service_worker": "background.js"
},
"content_scripts": [{
"matches": ["https://www.perplexity.ai/*"],
"js": ["content.js"],
"run_at": "document_start"
}]
}
Background Script: Request Interception
// background.js
chrome.webRequest.onBeforeRequest.addListener(
  (details) => {
    // The URL filter below already limits events to /rest/ endpoints
    console.log('[pplx-capture]', details.method, details.url);
    const bytes = details.requestBody?.raw?.[0]?.bytes;
    if (!bytes) return;
    const text = new TextDecoder().decode(new Uint8Array(bytes));
    let body;
    try {
      body = JSON.parse(text);
    } catch {
      body = text; // keep non-JSON payloads as raw text
    }
    chrome.storage.local.set({
      [`req_${Date.now()}`]: {
        url: details.url,
        method: details.method,
        body,
        timestamp: Date.now()
      }
    });
  },
  { urls: ["https://www.perplexity.ai/rest/*"] },
  ["requestBody"]
);
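Content Script: React State Extraction
// content.js: inject into page context
// Note: a strict page CSP may block inline scripts injected this way.
const script = document.createElement('script');
script.textContent = `
  // Hook into class-component state updates. This only works when the page
  // exposes a global React; bundled apps usually do not, and hook-based
  // components never call setState.
  if (window.React?.Component) {
    const origSetState = React.Component.prototype.setState;
    React.Component.prototype.setState = function(state, cb) {
      try {
        window.postMessage({
          type: 'PPLX_STATE_UPDATE',
          component: this.constructor.name,
          state: JSON.parse(JSON.stringify(state))
        }, '*');
      } catch (e) {
        // Updater functions and circular state cannot be serialized; skip them
      }
      return origSetState.call(this, state, cb);
    };
  }
`;
document.documentElement.appendChild(script);
// Listen for state updates posted by the injected script
window.addEventListener('message', (event) => {
  if (event.source === window && event.data?.type === 'PPLX_STATE_UPDATE') {
    chrome.runtime.sendMessage(event.data);
  }
});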
Step 7: Map Discoveries to SDK
| SPA Discovery | SDK Target | Action |
|---|---|---|
| React component state | domain/models.py | Model the state shape |
| API fetch calls | transport/http.py | Add endpoint methods |
| SSE event handlers | transport/sse.py | Map event types |
| Service worker cache | shared/ | Understand caching behavior |
| Auth token flow | shared/auth.py | Token refresh logic |
| WebSocket frames | transport/ | New WebSocket transport |
| GraphQL queries | domain/ | Query/mutation services |
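For the first row of the table, modeling a state shape usually starts from a JSON sample captured at runtime. The sketch below derives a rough type outline from such a sample; `infer_shape` is a hypothetical helper, and it assumes list items are homogeneous, so only the first element is inspected.

```python
def infer_shape(value):
    """Describe the structure of a JSON-like value using Python type names.

    Dicts are described key by key, lists by their first element (assumed
    representative), and scalars by the name of their type.
    """
    if isinstance(value, dict):
        return {key: infer_shape(item) for key, item in value.items()}
    if isinstance(value, list):
        return [infer_shape(value[0])] if value else []
    return type(value).__name__
```

The resulting outline is a starting point for hand-written models in `domain/models.py`; optional fields and unions only show up after comparing several samples.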
Step 8: SPA Source Code Graph
After runtime analysis, build a static code graph of the SPA source. Delegate to codegraph for structural analysis.
Source Map Recovery
# Extract original source paths from source maps
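# \K and the lookahead make grep print only the path, not the surrounding src="..."
curl -s https://www.perplexity.ai/ | grep -oP 'src="\K/[^"]*\.js(?=")' | while read -r url; do
  echo "Checking: $url" >&2  # progress goes to stderr so it stays out of the sorted list
  curl -s "https://www.perplexity.ai${url}.map" 2>/dev/null | \
    python3 -c "import sys,json; d=json.load(sys.stdin); print('\n'.join(d.get('sources',[])))" 2>/dev/null
done | sort -u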
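The inline `python3 -c` step can be pulled out into a reusable function once more than one map needs processing. This is a sketch with a hypothetical `original_sources` helper; it handles plain (non-indexed) source maps only and ignores the `sections` form.

```python
import json


def original_sources(source_map_text: str) -> list[str]:
    """Return the sorted, de-duplicated source paths listed in a source map.

    `sourceRoot`, when present, is prepended to each entry, as the source map
    format describes. Null entries in `sources` are skipped.
    """
    data = json.loads(source_map_text)
    root = data.get("sourceRoot") or ""
    return sorted({root + source for source in data.get("sources", []) if source})
```

Paths recovered this way often carry a bundler prefix such as `webpack://`, which is worth stripping before matching them against a local checkout.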
Static Analysis (from recovered source or public repo)
# Component tree from source
grep -rn "export \(default \)\?function \|export const .* = (" src/ --include="*.tsx" --include="*.jsx"
# Import graph
grep -rn "import .* from " src/ --include="*.ts" --include="*.tsx" | \
  awk -F: '{print $1 " → " $NF}' | sort -u
# Hook usage map
grep -rn "use[A-Z][a-zA-Z]*(" src/ --include="*.tsx" | \
grep -oP 'use[A-Z][a-zA-Z]*' | sort | uniq -c | sort -rn
# API call sites (fetch, axios, etc.)
grep -rn "fetch(\|axios\.\|api\.\|apiClient\." src/ --include="*.ts" --include="*.tsx"
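The hook usage map can also be built in Python, which avoids one false positive of the plain `grep` pattern: identifiers that merely contain `use`, such as `reuseItem(`. The sketch below is illustrative, with `count_hooks` as a hypothetical helper; it is still a textual match, so hook names inside comments or strings are counted too.

```python
import re
from collections import Counter

# A hook call: an identifier starting at a word boundary with "use" + capital letter,
# followed by an opening parenthesis.
HOOK_CALL = re.compile(r"\b(use[A-Z][A-Za-z0-9]*)\s*\(")


def count_hooks(source: str) -> Counter:
    """Count hook call sites by name in a chunk of TSX/JSX source text."""
    return Counter(HOOK_CALL.findall(source))
```

Summing the counters over every file under `src/` gives the same ranking as the shell pipeline, and custom hooks near the top of the list are good candidates for closer reading.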
Cross-Reference: Runtime ↔ Static
| Runtime Discovery (spa-expert) | Static Discovery (codegraph) | Cross-Reference |
|---|---|---|
| Fiber tree component names | Source component definitions | Match names to source files |
| Hook state values | Hook implementations | Map state shape to hook logic |
| Network API calls | fetch()/axios call sites | Confirm endpoints in source |
| Context provider values | createContext() definitions | Map runtime state to types |
| Service worker routes | Workbox config in source | Validate caching strategy |
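The first row of the table, matching fiber component names to source files, reduces to a dictionary lookup once both sides are collected. The following sketch assumes the runtime names come from the fiber walk and the definitions from the static `grep`; `cross_reference` and the component names in the usage note are hypothetical.

```python
def cross_reference(
    runtime_names: list[str], source_definitions: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Match runtime component names to the source files that define them.

    Returns (matched, unmatched): a name-to-file mapping for components found
    in source, and a sorted list of runtime names with no known definition.
    """
    matched = {
        name: source_definitions[name]
        for name in runtime_names
        if name in source_definitions
    }
    unmatched = sorted(set(runtime_names) - set(matched))
    return matched, unmatched
```

For example, `cross_reference(["ThreadList", "Anonymous"], {"ThreadList": "src/components/ThreadList.tsx"})` pairs `ThreadList` with its file and reports `Anonymous` as unmatched. Unmatched names often point to minified production builds or to third-party components.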
Perplexity.ai SPA Notes
Known Stack
- Framework: Next.js (React 18+)
- Bundler: Webpack (via Next.js, not raw Vite; this skill covers both to support broader SPA reverse engineering)
- State: React hooks + context (observed patterns)
- Streaming: SSE via fetch() with ReadableStream
- Auth: Cookie-based (pplx.session-id)
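Since streaming is listed as SSE, a captured response body can be split into events with a small parser. The sketch below follows the general `text/event-stream` framing (blank line ends an event, `:` starts a comment, multiple `data:` lines are joined with newlines). `parse_sse` is a hypothetical helper, and the event names in any real capture are not assumed here.

```python
import re


def parse_sse(text: str) -> list[dict]:
    """Split a text/event-stream body into a list of {"event", "data"} dicts."""
    events = []
    event_type, data_lines = "message", []
    for line in re.split(r"\r\n|\n|\r", text):
        if line == "":
            # A blank line dispatches the pending event, if it has any data
            if data_lines:
                events.append({"event": event_type, "data": "\n".join(data_lines)})
            event_type, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keepalive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event_type = value
            elif field == "data":
                data_lines.append(value)
    return events
```

An event that is not terminated by a blank line is dropped, which matches how a browser treats a stream cut off mid-event. Collecting the distinct `event` values from a capture gives the list of event types to map in `transport/sse.py`.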
Key DOM Selectors
// Query input
document.querySelector('textarea[placeholder*="Ask"]')
// Response area
document.querySelector('[class*="prose"]')
// Thread list
document.querySelector('[class*="thread"]')