paragraph-curator
npx skills add https://github.com/willoscar/research-units-pipeline-skills --skill paragraph-curator
Agent 安装分布
Skill 文档
Paragraph Curator (select -> evaluate -> subset -> fuse)
Purpose: turn âkeep rewriting and getting longerâ into a controlled convergence step.
This skill adds a decision layer between âdraft paragraphsâ and âpolish voiceâ:
- keep the best paragraphs
- merge redundant ones
- rewrite for clearer argument moves
- expand only when coverage is missing (using existing evidence cards)
This is a content-structure pass (not a style pass). Run style-harmonizer and opener-variator after curation.
Inputs
Required:
sections/(especially H3 bodies:sections/S<sub_id>.md)outline/writer_context_packs.jsonl(what each H3 must cover + allowed citations)output/ARGUMENT_SKELETON.md(single source of truth for terminology + premises)
Recommended:
output/SECTION_ARGUMENT_SUMMARIES.jsonl(paragraph moves + outputs)output/SECTION_LOGIC_REPORT.md(paragraph linkage risks)output/WRITER_SELFLOOP_TODO.md(style smells / scope/citation warnings)
Outputs
- Updated
sections/*.md(same filenames; body-only; no headings) output/PARAGRAPH_CURATION_REPORT.md(short; PASS/FAIL + what changed)- Create
sections/paragraphs_curated.refined.okwhen done (empty file; pipeline contract signal)
What this skill optimizes (rubric)
You are not trying to âshortenâ. You are trying to increase information density while keeping the section verifiable.
Score each paragraph on a simple 0-2 rubric:
| Criterion | 0 (bad) | 1 (ok) | 2 (good) |
|---|---|---|---|
| Coverage | does not match any required axis/card | matches one axis, thin | directly executes a must-use card/comparison |
| Novelty | repeats nearby content | partially redundant | adds a distinct comparison/insight |
| Move clarity | unclear what it does | move exists, weak output | clear move + reusable output |
| Consistency | premise/term drift vs skeleton | minor mismatch | fully aligned with Consistency Contract |
| Citation hygiene | uncited when it should be; cite-dump vibe | acceptable | citations are local and anchored (not just tail) |
| Fusion readiness | cannot merge; tangled | mergeable with edits | clean unit that can be fused or kept |
Decision labels:
KEEP: keep mostly as-isREWRITE: keep content, rewrite for clearer move/outputFUSE: merge with neighbor(s) and rewrite into one stronger paragraphREPLACE: keep the slot, but rewrite using existing evidence cards (when coverage is missing)
Paragraph budget (profile-aware)
Default per-H3 target:
draft_profile=survey: 10-12 paragraphsdraft_profile=deep: 11-13 paragraphs
If you exceed the budget, do not delete content blindly. Prefer FUSE (merge redundancy) and make the fused paragraph denser.
Must-have coverage checklist (per H3)
Each H3 must contain at least:
- 1x
Definition/Setup(only if this H3 introduces a new term/protocol field) - 2x concrete
Contrastparagraphs (A-vs-B comparisons; not just âmany papers do…â) - 1x
Evaluation anchorparagraph (task + metric + constraint/budget/tool access; cite-backed) - 1x cross-paper
Synthesisparagraph (what generalizes, what does not; cite-backed) - 1x
Boundary/Failureparagraph (limitations; threats to validity; cite-backed when possible) - 1x
Local conclusion(a reusable takeaway used downstream)
If any item is missing, use REPLACE to write that paragraph from the writer context pack (do not invent new facts).
Workflow (minimal)
- Pick the target set
- Start with the H3 bodies listed in
output/SECTION_LOGIC_REPORT.md, plus any H3 flagged inoutput/WRITER_SELFLOOP_TODO.mdas repetitive/template-y, plus any H3 that keeps growing across edits. - Work file-by-file: each target is a concrete
sections/S<sub_id>.md.
- Build a paragraph inventory (scratch only; do not paste into the paper)
- If
output/SECTION_ARGUMENT_SUMMARIES.jsonlexists, use its per-paragraphmoves/outputas the first draft of your inventory, then reconcile with the actual text. For each paragraph, write one line: P<i> :: move(s) -> output (1 sentence) :: citations (keys)
- Apply the rubric and label each paragraph
- Mark
KEEP/REWRITE/FUSE/REPLACE. - If two adjacent paragraphs repeat the same axis,
FUSE. - For any paragraph you plan to change (
REWRITE/REPLACE/FUSE), draft 2-3 candidate rewrites in parallel (different angles: contrast-first / protocol-first / synthesis-first).- Score candidates quickly with the rubric; keep one winner (or fuse two if they cover complementary axes).
- Keep citation keys unchanged while sampling; you are choosing surface form + structure, not changing the evidence set.
- Construct the curated set
- Use
outline/writer_context_packs.jsonlto enforce must-have coverage (paragraph_plan/must_use/comparison_cards/limitation_hooks) without inventing new content. - Enforce the must-have coverage checklist.
- Enforce the paragraph budget by fusing redundancy rather than deleting substance.
- Fuse + rewrite (keep citation keys fixed) Rules that keep the pipeline stable:
- Do not add/remove citation keys; when fusing, carry citations forward and re-anchor them to the right sentence.
- Do not move citations across subsections.
- Avoid adjacent citation blocks (e.g.,
[@a] [@b]) and duplicate keys in one block (e.g.,[@a; @a]). - When fusing, it is often faster to write two fused candidates (one contrast-heavy, one synthesis-heavy) and pick the better one.
- Write the report + marker
output/PARAGRAPH_CURATION_REPORT.mdshould be short and actionable:- Status: PASS|FAIL- per H3: paragraph count before/after; what was fused; any remaining gaps
- (minimal) how many candidates you tried for the main rewrites (e.g., 2-3), so future passes can see whether this was a real selection step
- Create
sections/paragraphs_curated.refined.ok.
Routing rules
- If you cannot fill a missing must-have paragraph without new evidence: stop and route upstream (
evidence-selfloop/ C3-C4). Do not pad. - If you feel forced to change a definition or evaluation premise: update
output/ARGUMENT_SKELETON.md# Consistency Contractfirst, then rerunargument-selfloop. - If the only issue is surface cadence/openers: do not overwork curation; run
style-harmonizer/opener-variator.
Done checklist
- Each targeted H3 stays within its paragraph budget (survey 10-12; deep 11-13) without losing required moves.
- Redundant paragraphs are fused into denser, clearer ones (not just deleted).
- No citation keys were added/removed; citation shape is reader-facing (no adjacent blocks, no dup keys).
-
output/PARAGRAPH_CURATION_REPORT.mdexists and is understandable. -
sections/paragraphs_curated.refined.okexists.