fp-kstream-design
Total installs: 1
Weekly installs: 1
Site rank: #77816
Install command
npx skills add https://github.com/mpurbo/purbo-skills --skill fp-kstream-design
Installs by agent
amp: 1
cline: 1
opencode: 1
cursor: 1
kimi-cli: 1
codex: 1
Skill documentation
Kafka Topology Design Skill
Design deterministic, replay-safe, cost-efficient Kafka Streams topologies using KSA patterns.
Required Reading
Before responding, load the shared reference:
cat ${SKILL_PATH}/references/KSA.md
This is the authoritative source for all patterns, principles, and constraints.
Workflow
Step 1: Understand the Problem
Gather from the user (ask if missing):
- Source events: what topics trigger this service?
- Enrichment needs: what data is required beyond the event itself?
- Outputs: output topics, DB writes, notifications?
- Statefulness: does the output depend on past events?
- Partition key: what entity scopes this? (userId, orderId, etc.)
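The checklist above can be captured as a small record so that unanswered questions are easy to spot before any design work starts. This is only a sketch: `ProblemBrief` and its field names are hypothetical and do not come from KSA.md.

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ProblemBrief:
    """Hypothetical record of the Step 1 answers (names are illustrative)."""
    source_topics: list[str] = field(default_factory=list)
    enrichment_needs: Optional[list[str]] = None  # [] means "none needed"
    outputs: list[str] = field(default_factory=list)
    stateful: Optional[bool] = None               # does output depend on past events?
    partition_key: Optional[str] = None           # e.g. "userId", "orderId"

    def missing(self) -> list[str]:
        """Return the names of the questions the user still has to answer."""
        required_lists = ("source_topics", "outputs")
        unanswered = []
        for f in fields(self):
            value = getattr(self, f.name)
            # Optional answers must not be None; required lists must be non-empty.
            if value is None or (f.name in required_lists and not value):
                unanswered.append(f.name)
        return unanswered


brief = ProblemBrief(source_topics=["orders"], stateful=True)
print(brief.missing())  # the questions to ask before designing
```

Anything returned by `missing()` is a prompt for a clarifying question rather than a value to guess.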
Step 2: Select Recipes
Map to KSA recipes (KSA.md §4):
| Problem involves... | Recipe |
|---|---|
| Cleaning/validating inbound events | 01 - Validation & Normalization |
| Duplicate events from upstream | 02 - Deduplication |
| Splitting events to different consumers | 03 - Routing & Fan-Out |
| Looking up reference data | 04 - Data Enrichment |
| Reference data + historical computation | 05 - Enrichment + Stateful |
| Counting, rate limiting, windowed metrics | 06 - Windowed Aggregation |
| Entity lifecycle (order, payment, KYC) | 07 - Per-Key State Machine |
| Cross-service coordination with rollback | 08 - Saga Orchestrator |
| Building a read model or search index | 09 - CQRS Projection |
| Bug fix replay or data backfill | 10 - Event Replay |
Step 3: Compose the Topology
Arrange recipes left to right:
Source → [Ingress] → [Enrichment] → [Computation] → [Egress] → Sink
Not every stage is needed. Include only what the problem requires.
Step 4: Draw the Diagram
Produce a Mermaid flowchart LR using the KSA symbol legend (KSA.md §3):
- [TopicName]: Kafka topic
- [TopicName*]: compacted topic (KTable source)
- (Processor): stateless processor
- {{Processor}}: stateful processor
- ((Join)): stream-table join
- [[Sink]]: side-effect boundary
- {Decision?}: conditional branch
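To show how the legend maps onto Mermaid node shapes, here is a minimal sketch that renders a `flowchart LR` from a list of nodes and edges. The kind names (`topic`, `compacted`, and so on) and the `render_flowchart` helper are hypothetical, chosen for illustration. Labels are quoted so that characters such as `*` and `?` are treated as plain text.

```python
# kind -> (opening delimiter, closing delimiter, label suffix)
SHAPES = {
    "topic":     ("[", "]", ""),
    "compacted": ("[", "]", "*"),   # compacted topic (KTable source)
    "stateless": ("(", ")", ""),
    "stateful":  ("{{", "}}", ""),
    "join":      ("((", "))", ""),
    "sink":      ("[[", "]]", ""),  # side-effect boundary
    "decision":  ("{", "}", "?"),
}


def render_flowchart(nodes, edges):
    """Render Mermaid text.

    nodes: list of (node_id, kind, label) tuples
    edges: list of (from_id, to_id) tuples
    """
    lines = ["flowchart LR"]
    for node_id, kind, label in nodes:
        opener, closer, suffix = SHAPES[kind]
        lines.append(f'    {node_id}{opener}"{label}{suffix}"{closer}')
    for src, dst in edges:
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)


diagram = render_flowchart(
    nodes=[
        ("orders", "topic", "Orders"),
        ("customers", "compacted", "Customers"),
        ("validate", "stateless", "Validate"),
        ("enrich", "join", "Enrich"),
        ("db", "sink", "OrdersDb"),
    ],
    edges=[("orders", "validate"), ("validate", "enrich"),
           ("customers", "enrich"), ("enrich", "db")],
)
print(diagram)
```

The printed text can be pasted into a Mermaid code fence. The topic and processor names here are made up for the example.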
Step 5: Declare Policies
For every topology, explicitly document:
- Missing-state policy per join (drop / dead-letter / retry / buffer)
- Partition key and why it aligns with all joins
- State store retention per stateful processor
- Sink idempotency strategy
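One way to keep these declarations honest is to record them in a structure that can be checked for gaps. The sketch below assumes a hypothetical `TopologyPolicies` record. The four missing-state options come from the list above; everything else (field names, the hours unit) is illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

ALLOWED_MISSING_STATE = {"drop", "dead-letter", "retry", "buffer"}


@dataclass
class TopologyPolicies:
    """Hypothetical policy declaration for one topology."""
    partition_key: str
    join_keys: dict[str, str]                  # join name -> key the join uses
    missing_state: dict[str, str]              # join name -> declared policy
    retention_hours: dict[str, Optional[int]]  # stateful processor -> retention
    sink_idempotency: str                      # e.g. "upsert by orderId"

    def problems(self) -> list[str]:
        """List every policy that is undeclared or inconsistent."""
        found = []
        for join, key in self.join_keys.items():
            if self.missing_state.get(join) not in ALLOWED_MISSING_STATE:
                found.append(f"{join}: no valid missing-state policy")
            if key != self.partition_key:
                found.append(f"{join}: join key {key!r} differs from partition key")
        for store, hours in self.retention_hours.items():
            if hours is None:
                found.append(f"{store}: no retention declared")
        if not self.sink_idempotency.strip():
            found.append("sink: no idempotency strategy")
        return found


draft = TopologyPolicies(
    partition_key="orderId",
    join_keys={"order-customer-join": "customerId"},
    missing_state={},
    retention_hours={"order-fsm-store": None},
    sink_idempotency="upsert by orderId",
)
for problem in draft.problems():
    print(problem)
```

An empty result from `problems()` means every join, store, and sink has an explicit decision. It does not mean the decisions are the right ones.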
Step 6: Cost Check
Estimate per KSA.md §7.4:
| Factor | Estimate | Red Flag |
|---|---|---|
| State store size/key | value × keys × retention | > 50 GB/instance |
| Changelog overhead | store size × replication | > 100 GB total |
| Repartition count | selectKey/through calls | > 2 on high-volume |
| KTable restore time | topic size / throughput | > 10 minutes |
| Partition count | all internal + output topics | > 500 total |
If multiple red flags appear, recommend alternatives (KSA.md §7.3).
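The table's arithmetic can be sketched as a back-of-napkin calculator. The thresholds below are copied from the table. The parameter names, the treatment of "retention" as a dimensionless multiplier (for example, windows kept per key), and the even split of the store across instances are assumptions made for illustration, not rules from KSA.md.

```python
GIB = 1024 ** 3


def cost_flags(value_bytes, keys, retention_factor, instances,
               replication, repartitions, ktable_bytes,
               restore_bytes_per_sec, total_partitions):
    """Return the names of the red flags raised by the table's thresholds."""
    store_total = value_bytes * keys * retention_factor   # value x keys x retention
    changelog_total = store_total * replication           # store size x replication
    restore_minutes = ktable_bytes / restore_bytes_per_sec / 60

    checks = [
        # Assumes state is spread evenly across instances.
        ("state store size", store_total / instances > 50 * GIB),
        ("changelog overhead", changelog_total > 100 * GIB),
        ("repartition count", repartitions > 2),
        ("KTable restore time", restore_minutes > 10),
        ("partition count", total_partitions > 500),
    ]
    return [name for name, exceeded in checks if exceeded]


# 1 KiB values, 10 million keys, 4 instances, replication factor 3.
small = cost_flags(1024, 10_000_000, 1, 4, 3, 1, 5 * GIB, 50 * 1024 ** 2, 120)
# Same service with 1 billion keys and three repartitions.
large = cost_flags(1024, 1_000_000_000, 1, 4, 3, 3, 5 * GIB, 50 * 1024 ** 2, 120)
print(small)
print(large)
```

With these made-up inputs, the first service raises no flags. The second raises three, which is the point where the workflow calls for recommending alternatives.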
Step 7: Compliance Checklist
Verify against KSA.md §6 before signing off.
Output Format
Always produce:
- Summary: one paragraph describing what the service does
- Recipes used: numbered list of KSA recipe numbers and names
- Topology diagram: Mermaid flowchart LR
- State diagram: Mermaid stateDiagram-v2 (if an FSM is involved)
- Policy table: missing-state, retention, and idempotency decisions
- Cost estimate: back-of-napkin numbers for the heuristic
- Compliance: checklist pass/fail
Anti-Patterns to Flag
| Anti-Pattern | Why It's Wrong | Suggest |
|---|---|---|
| HTTP calls inside processor | Breaks replay determinism | KTable enrichment |
| DB queries inside processor | Same as above | Compacted topic |
| No declared missing-state policy | Undeclared behavior = design defect | Ask: "What happens when the KTable has no entry?" |
| Partition key mismatch | Join key ≠ partition key | Repartition (flag cost) |
| Unbounded state stores | No TTL = unbounded growth | Ask about retention |
| GlobalKTable for large data | Loads ALL data on EVERY instance | Regular KTable with partition-aligned joins |
| Multiple repartitions on same stream | Each doubles I/O | Redesign key strategy |
| Stateful where stateless suffices | Unnecessary state store overhead | Remove state store |
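Several rows of this table can be checked mechanically once a design is written down as data. The following is a rough sketch over a hypothetical `ProcessorSpec` description; it is not a parser for real Kafka Streams code, and all names are illustrative. It covers only four of the rows: external calls, key mismatch, multiple repartitions, and unnecessary state.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessorSpec:
    """Hypothetical description of one processor in a proposed topology."""
    name: str
    external_calls: list = field(default_factory=list)  # e.g. ["http", "db"]
    stateful: bool = False
    uses_past_events: bool = False  # does the logic actually read prior state?


def flag_anti_patterns(processors, join_key, partition_key, repartition_count):
    """Return human-readable findings for a subset of the table above."""
    findings = []
    for p in processors:
        if p.external_calls:
            calls = ", ".join(p.external_calls)
            findings.append(f"{p.name}: external calls ({calls}) break replay determinism")
        if p.stateful and not p.uses_past_events:
            findings.append(f"{p.name}: stateful where stateless suffices")
    if join_key != partition_key:
        findings.append("join key differs from partition key: repartition needed (flag cost)")
    if repartition_count > 1:
        findings.append("multiple repartitions on the same stream: redesign key strategy")
    return findings


findings = flag_anti_patterns(
    processors=[
        ProcessorSpec("EnrichOrder", external_calls=["http"]),
        ProcessorSpec("Normalize", stateful=True),
    ],
    join_key="customerId",
    partition_key="orderId",
    repartition_count=2,
)
for finding in findings:
    print(finding)
```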
Conversation Style
- Ask clarifying questions before designing. Problem statements are often incomplete.
- When multiple recipes apply, explain trade-offs and let the engineer choose.
- Always produce a diagram.
- Be explicit about what the topology does NOT handle (scope).
- If the problem is better solved without KStreams, say so (KSA.md §7.3).