backfilling-atproto

📁 cpfiffer/central 📅 Feb 13, 2026
Install command
npx skills add https://github.com/cpfiffer/central --skill backfilling-atproto


Skill documentation

Backfilling ATProto Records

Guide for retrieving historical records from ATProto for custom collections.

Key Insight: PDS vs Public API

Public API (public.api.bsky.app):

  • Works for app.bsky.* collections
  • Returns 404 for custom collections (e.g., network.comind.*)

PDS Direct (e.g., comind.network):

  • Works for ALL collections including custom ones
  • Required for network.comind.*, stream.thought.*, etc.
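The split above can be expressed as a small routing helper. This is only a sketch: `base_url_for` is a hypothetical name, and it assumes a simple prefix check on the collection NSID is enough to tell the two cases apart.

```python
PUBLIC_API = "https://public.api.bsky.app"


def base_url_for(collection: str, pds: str) -> str:
    """Pick the host to query for a collection NSID (hypothetical helper)."""
    # app.bsky.* collections are served by the public API; anything else
    # (network.comind.*, stream.thought.*, ...) has to go to the account's PDS.
    if collection.startswith("app.bsky."):
        return PUBLIC_API
    return pds.rstrip("/")
```

Because the PDS serves every collection, always querying the PDS is an equally valid and simpler choice; the helper only matters if you prefer the public API where it works.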

Finding the PDS

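  1. Resolve the handle to a DID:
import httpx

resp = httpx.get(
    "https://public.api.bsky.app/xrpc/app.bsky.actor.getProfile",
    params={"actor": "central.comind.network"}
)
resp.raise_for_status()
did = resp.json()["did"]
  2. Fetch the DID document and read the PDS endpoint (plc.directory only resolves did:plc identifiers):
resp = httpx.get(f"https://plc.directory/{did}")
resp.raise_for_status()
services = resp.json()["service"]
pds = next(s["serviceEndpoint"] for s in services if s["id"] == "#atproto_pds")
# Returns: https://comind.network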
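The lookup above can be made a little more defensive. The sketch below separates "where does the DID document live" from "which service entry is the PDS". `did_document_url` and `pds_endpoint` are hypothetical names; it assumes the PDS entry carries the id `#atproto_pds` and that any did:web identifier is a bare hostname.

```python
def did_document_url(did: str) -> str:
    """Return the URL where a DID document is expected to live."""
    if did.startswith("did:plc:"):
        return f"https://plc.directory/{did}"
    if did.startswith("did:web:"):
        # did:web:example.com -> https://example.com/.well-known/did.json
        host = did[len("did:web:"):]
        return f"https://{host}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {did}")


def pds_endpoint(did_doc: dict) -> str:
    """Extract the PDS endpoint from a parsed DID document."""
    for service in did_doc.get("service", []):
        # Match on the service id instead of relying on list position.
        if service.get("id", "").endswith("#atproto_pds"):
            return service["serviceEndpoint"].rstrip("/")
    raise LookupError("no #atproto_pds service in DID document")
```

Usage: `pds = pds_endpoint(httpx.get(did_document_url(did)).json())`.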

Listing Records

import httpx
from typing import Iterator

def list_records(pds: str, did: str, collection: str) -> Iterator[dict]:
    """Yield every record in a collection, following the cursor until it runs out."""
    cursor = None
    while True:
        params = {
            "repo": did,
            "collection": collection,
            "limit": 100,
        }
        if cursor:
            params["cursor"] = cursor
        
        resp = httpx.get(
            f"{pds}/xrpc/com.atproto.repo.listRecords",
            params=params,
            timeout=30
        )
        resp.raise_for_status()
        data = resp.json()
        
        for record in data.get("records", []):
            yield record
        
        cursor = data.get("cursor")
        if not cursor:
            break
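To exercise the cursor loop without touching a live PDS, the page fetch can be passed in as a function. This is a sketch of the same pagination logic, not a replacement for `list_records`; `paginate` and the `PAGES` fixture are hypothetical and the fixture data is made up.

```python
from typing import Callable, Iterator, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield records from successive pages until no cursor is returned."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("records", [])
        cursor = page.get("cursor")
        if not cursor:
            break


# Fake two-page response standing in for the PDS (illustrative data only).
PAGES = {
    None: {"records": [{"uri": "a"}, {"uri": "b"}], "cursor": "p2"},
    "p2": {"records": [{"uri": "c"}]},
}
uris = [r["uri"] for r in paginate(lambda cursor: PAGES[cursor])]
```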

Record Structure

Each record returned by listRecords has this shape:

{
    "uri": "at://did:plc:.../network.comind.thought/3md...",
    "cid": "bafyrei...",
    "value": {
        "$type": "network.comind.thought",
        "thought": "The actual content...",
        "createdAt": "2026-01-24T02:03:05.714Z",
        # ... other fields
    }
}

Access fields:

uri = record["uri"]
rkey = uri.split("/")[-1]
content = record["value"]
created = content.get("createdAt")
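If more than the rkey is needed, the URI and timestamp can be parsed into structured values. A minimal sketch, assuming URIs always have the three-part form shown above; `AtUri`, `parse_at_uri`, and `parse_created_at` are hypothetical names.

```python
from datetime import datetime
from typing import NamedTuple


class AtUri(NamedTuple):
    repo: str
    collection: str
    rkey: str


def parse_at_uri(uri: str) -> AtUri:
    """Split at://<repo>/<collection>/<rkey> into its three parts."""
    if not uri.startswith("at://"):
        raise ValueError(f"not an AT URI: {uri}")
    parts = uri[len("at://"):].split("/")
    if len(parts) != 3:
        raise ValueError(f"expected repo/collection/rkey: {uri}")
    return AtUri(*parts)


def parse_created_at(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-01-24T02:03:05.714Z."""
    # Older Python versions reject a trailing "Z", so normalise it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```

`createdAt` is set by whoever wrote the record, so treat it as a claim rather than a guaranteed write time.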

Backfill Pattern

import httpx

PDS = "https://comind.network"
COLLECTIONS = [
    "network.comind.concept",
    "network.comind.thought",
    "network.comind.memory",
]

def backfill_account(did: str):
    for collection in COLLECTIONS:
        print(f"Backfilling {collection}...")
        for record in list_records(PDS, did, collection):
            # Process record
            uri = record["uri"]
            content = record["value"]
            # ... store, index, etc.
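One way to fill in the "store" step so that re-running the backfill is harmless is to key rows on the record URI. This is a sketch using the standard-library `sqlite3` module; the table layout and the names `open_store` and `store_record` are assumptions, not part of the skill.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store with one row per record URI."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "uri TEXT PRIMARY KEY, cid TEXT NOT NULL, value TEXT NOT NULL)"
    )
    return conn


def store_record(conn: sqlite3.Connection, record: dict) -> bool:
    """Insert a record once; return True if it was new, False if already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO records (uri, cid, value) VALUES (?, ?, ?)",
        (record["uri"], record["cid"], json.dumps(record["value"])),
    )
    conn.commit()
    return cur.rowcount == 1
```

Inside `backfill_account`, calling `store_record(conn, record)` for each record would skip anything already seen. Note that this sketch ignores edits: a record rewritten under the same URI gets a new cid, and detecting that would need a comparison on `cid`.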

Common Custom Collections

  • network.comind.* – comind collective (PDS: comind.network)
  • stream.thought.* – void’s cognition (PDS: bsky.social or custom)

Notes

  • Check whether a record has already been stored before processing it, so that re-running a backfill is idempotent
  • Follow the cursor returned by each page instead of re-fetching the collection from the start
  • Rate limit: roughly 100 requests/minute is safe for most PDSs
  • For large backfills, add a delay between requests
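The delay mentioned above can be handled by a small throttle that spaces requests evenly. A minimal sketch: `Throttle` is a hypothetical helper, the default of 100 per minute simply mirrors the figure in the notes, and the injectable `clock` and `sleep` exist only so it can be tested without waiting.

```python
import time


class Throttle:
    """Space calls to wait() at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute: float = 100, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon since the previous call: sleep off the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Create one `Throttle()` per PDS and call `throttle.wait()` immediately before each `httpx.get` in the pagination loop.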