firecrawl-scraper

📁 bobby44-max/boston-luxury-re-producer 📅 5 days ago
Total installs: 2
Weekly installs: 2
Site-wide rank: #69259
Install command
npx skills add https://github.com/bobby44-max/boston-luxury-re-producer --skill firecrawl-scraper

Agent install distribution

trae 2
gemini-cli 2
claude-code 2
github-copilot 2
windsurf 2
codex 2

Skill documentation

Firecrawl Property Extraction

Overview

This skill uses Firecrawl to turn a real estate listing URL into structured JSON for video generation. Firecrawl handles JavaScript rendering, anti-bot measures, and image extraction automatically; data quality depends on the source site (see Supported Sites).

Quick Start

import Firecrawl from '@mendable/firecrawl-js';
import { z } from 'zod';

const firecrawl = new Firecrawl({
  apiKey: process.env.FIRECRAWL_API_KEY
});

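// listingUrl is the listing page to extract, supplied by the caller.
// PropertySchema is the Zod schema defined under "Property Schema".
const result = await firecrawl.extract({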
  urls: [listingUrl],
  prompt: 'Extract property details for video generation',
  schema: PropertySchema
});

Supported Sites

Site          URL Pattern                               Data Quality
Zillow        zillow.com/homedetails/*                  Excellent
Redfin        redfin.com/*/home/*                       Excellent
Realtor.com   realtor.com/realestateandhomes-detail/*   Excellent
Trulia        trulia.com/home/*                         Good
Homes.com     homes.com/property/*                      Good
MLS Sites     Varies by region                          Good
Broker Sites  Any                                       Variable
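
Before spending credits on an extraction, it can help to check which row of the table a URL falls under. The following is a minimal sketch of such a check. The function name `matchSupportedSite`, the `SITE_PATTERNS` list, and the regular expressions are illustrative assumptions, not part of Firecrawl or this skill, and the Redfin pattern is read here as `redfin.com/*/home/*`. MLS and broker sites have no fixed pattern, so they return `null`.

```typescript
// Sites with a fixed URL pattern in the Supported Sites table.
type SupportedSite = 'Zillow' | 'Redfin' | 'Realtor.com' | 'Trulia' | 'Homes.com';

// Hypothetical regex approximations of the table's glob patterns.
const SITE_PATTERNS: ReadonlyArray<{ site: SupportedSite; host: string; path: RegExp }> = [
  { site: 'Zillow', host: 'zillow.com', path: /^\/homedetails\/.+/ },
  { site: 'Redfin', host: 'redfin.com', path: /^\/.+\/home\/.+/ },
  { site: 'Realtor.com', host: 'realtor.com', path: /^\/realestateandhomes-detail\/.+/ },
  { site: 'Trulia', host: 'trulia.com', path: /^\/home\/.+/ },
  { site: 'Homes.com', host: 'homes.com', path: /^\/property\/.+/ },
];

/** Returns the matching site name, or null for MLS, broker, or unknown URLs. */
function matchSupportedSite(rawUrl: string): SupportedSite | null {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return null; // not a parseable absolute URL
  }
  // Strip a leading "www." so zillow.com and www.zillow.com both match.
  const host = url.hostname.replace(/^www\./, '');
  const entry = SITE_PATTERNS.find((p) => p.host === host && p.path.test(url.pathname));
  return entry ? entry.site : null;
}
```

A `null` result does not mean extraction will fail; it only means the URL falls under the "MLS Sites" or "Broker Sites" rows, where quality is less predictable.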

Property Schema

See rules/property-extraction.md for the complete schema.

const PropertySchema = z.object({
  address: z.string(),
  city: z.string(),
  state: z.string(),
  zipCode: z.string(),
  price: z.number(),
  bedrooms: z.number(),
  bathrooms: z.number(),
  sqft: z.number(),
  lotSize: z.string().optional(),
  yearBuilt: z.number().optional(),
  propertyType: z.string(),
  description: z.string(),
  features: z.array(z.string()),
  images: z.array(z.string()),
  agent: z.object({
    name: z.string(),
    phone: z.string().optional(),
    brokerage: z.string().optional(),
  }).optional(),
});

Advanced Extraction

Competitor Analysis

const CompetitorSchema = z.object({
  listings: z.array(z.object({
    address: z.string(),
    price: z.number(),
    daysOnMarket: z.number(),
    pricePerSqft: z.number(),
  })),
  marketTrends: z.object({
    medianPrice: z.number(),
    averageDaysOnMarket: z.number(),
    inventoryCount: z.number(),
  }),
});
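
Because `marketTrends` is produced by AI extraction, one possible sanity check is to recompute comparable figures from the `listings` array and compare them with what came back. The sketch below assumes the extracted listings are the whole comparison set, which may not hold if the page reports market-wide figures. `summarizeListings` and the two interfaces are hypothetical names that mirror the schema fields above.

```typescript
interface CompetitorListing {
  address: string;
  price: number;
  daysOnMarket: number;
  pricePerSqft: number;
}

interface MarketTrends {
  medianPrice: number;
  averageDaysOnMarket: number;
  inventoryCount: number;
}

/** Recomputes the marketTrends fields from the listings array alone. */
function summarizeListings(listings: CompetitorListing[]): MarketTrends | null {
  if (listings.length === 0) return null; // nothing to summarize

  // Sort a copy of the prices in ascending order to find the median.
  const prices = listings.map((l) => l.price).sort((a, b) => a - b);
  const mid = Math.floor(prices.length / 2);
  // Even count: average the two middle prices; odd count: take the middle one.
  const medianPrice =
    prices.length % 2 === 0 ? (prices[mid - 1] + prices[mid]) / 2 : prices[mid];

  const totalDays = listings.reduce((sum, l) => sum + l.daysOnMarket, 0);

  return {
    medianPrice,
    averageDaysOnMarket: totalDays / listings.length,
    inventoryCount: listings.length,
  };
}
```

A large gap between the recomputed and extracted values is a hint to inspect the extraction, not proof that either number is wrong.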

Market Data

See rules/market-data.md

Best Practices

  1. Rate Limiting: Max 10 requests/minute on standard plan
  2. Error Handling: Always wrap extraction calls in try/catch
  3. Image Quality: Request high-res images when available
  4. Caching: Cache results for 24 hours to save credits
  5. Validation: Always validate extracted data with Zod
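
The 24-hour caching advice could be sketched as a small in-memory cache keyed by listing URL. This is one possible approach, not something the skill or Firecrawl provides: `TtlCache` and `getOrFetch` are hypothetical names, the cache is lost on process restart, and `fetcher` stands in for whatever extraction call you use.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

/** Minimal in-memory cache with a time-to-live, keyed by listing URL. */
class TtlCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without waiting a day.
  constructor(ttlMs: number = DAY_MS, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

/** Returns the cached value for `url`, calling `fetcher` only on a miss. */
async function getOrFetch<T>(
  cache: TtlCache<T>,
  url: string,
  fetcher: (url: string) => Promise<T>,
): Promise<T> {
  const hit = cache.get(url);
  if (hit !== undefined) return hit;
  const fresh = await fetcher(url);
  cache.set(url, fresh);
  return fresh;
}
```

In a serverless deployment, where instances are short-lived, a shared store would be needed instead; the in-memory map is only meant to show the expiry logic.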

API Integration

Next.js Route Handler

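// /app/api/scrape/route.ts
import Firecrawl from '@mendable/firecrawl-js';
// Assumed path: import PropertySchema from wherever your project defines it.
import { PropertySchema } from '@/lib/property-schema';

export async function POST(request: Request) {
  const { url } = await request.json();

  // Reject requests that do not carry a listing URL.
  if (typeof url !== 'string' || url.length === 0) {
    return Response.json({ success: false, error: 'Missing url' }, { status: 400 });
  }

  const firecrawl = new Firecrawl({
    apiKey: process.env.FIRECRAWL_API_KEY!
  });

  try {
    const result = await firecrawl.extract({
      urls: [url],
      prompt: 'Extract property listing data',
      schema: PropertySchema,
    });

    return Response.json({
      success: true,
      data: result.data
    });
  } catch (error) {
    // See "Error Handling" for status-specific handling.
    return Response.json({ success: false, error: 'Extraction failed' }, { status: 502 });
  }
}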

Credit Usage

Operation               Credits
/scrape (single page)   1
/crawl (per page)       1
/extract (AI)           Token-based
/map (URL discovery)    1 per 100 URLs
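
As a worked example of the table, the fixed-rate operations can be budgeted with simple arithmetic. The sketch below is illustrative only: `estimateCredits` and `PlannedUsage` are hypothetical names, `/extract` is left out because its cost is token-based, and rounding `/map` up to the next block of 100 URLs is an assumption about how partial blocks are billed.

```typescript
interface PlannedUsage {
  scrapePages: number; // pages fetched via /scrape
  crawlPages: number;  // pages fetched via /crawl
  mapUrls: number;     // URLs discovered via /map
}

/** Estimates credits for the fixed-rate operations in the table. */
function estimateCredits(usage: PlannedUsage): number {
  const scrape = usage.scrapePages * 1; // 1 credit per page
  const crawl = usage.crawlPages * 1;   // 1 credit per page
  // Assumption: a partial block of 100 URLs is billed as a full credit.
  const map = Math.ceil(usage.mapUrls / 100);
  return scrape + crawl + map;
}

// 20 scraped pages + 50 crawled pages + 250 mapped URLs (3 blocks) = 73 credits
const monthlyEstimate = estimateCredits({ scrapePages: 20, crawlPages: 50, mapUrls: 250 });
```

Treat the result as a lower bound for any workflow that also calls `/extract`.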

Error Handling

try {
  const result = await firecrawl.extract({ ... });
} catch (error) {
  if (error.statusCode === 429) {
    // Rate limited - wait and retry
  } else if (error.statusCode === 403) {
    // Site blocked - try alternative approach
  } else {
    // Log and return fallback
  }
}
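
The "wait and retry" branch for 429 responses could be filled in with exponential backoff. The following is a minimal sketch under stated assumptions: `withRetry`, `backoffDelayMs`, and `statusOf` are hypothetical helpers, the delays (1 s base, 30 s cap, 3 retries) are arbitrary defaults, and the thrown error is assumed to expose a numeric `statusCode` as in the snippet above. Only rate limits are retried; 403 and other failures are rethrown for the caller to handle.

```typescript
/** Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

/** Reads a numeric `statusCode` from an unknown thrown value, if present. */
function statusOf(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'statusCode' in error) {
    const status = (error as { statusCode: unknown }).statusCode;
    return typeof status === 'number' ? status : undefined;
  }
  return undefined;
}

/** Runs `operation`, retrying with backoff only when it fails with status 429. */
async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on non-429 errors, or once the retry budget is spent.
      if (statusOf(error) !== 429 || attempt >= maxRetries) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

To use it, pass the extraction call as the `operation` callback and keep the surrounding try/catch for the errors that `withRetry` rethrows.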