manuscript-review
npx skills add https://github.com/mathews-tom/praxis-skills --skill manuscript-review
Manuscript Review Skill
Purpose
Execute a comprehensive, multi-pass diagnostic audit of an academic or technical manuscript, producing a structured improvement report that identifies issues across 24 audit dimensions: from macro-coherence and argumentative architecture through claims-evidence calibration, narrative flow, prose microstructure, rendered visual inspection, and cross-element coherence, down to citation hygiene and reproducibility.
The output is a prioritized, actionable improvement plan, not a line edit. The goal is to surface structural, logical, and clarity issues that authors systematically miss because they’re too close to the text.
Optimized for arXiv/preprint submissions with flexible compliance standards.
Companion skill: manuscript-provenance audits whether manuscript content
(numbers, tables, figures, ordering, terminology) is computationally derived
from code and scripts. This skill audits the document as prose; that skill
audits computational grounding. Run both for complete pre-publication coverage.
Boundary Agreement with manuscript-provenance
| Concern | This skill (manuscript-review) | manuscript-provenance |
|---|---|---|
| Reproducibility | Does the paper describe enough to reproduce? (§6) | Does the code actually produce what the paper claims? (§1, §7) |
| Figures/Tables | Legible, accessible, well-formatted? (§12) | Generated by scripts, not manual entry? (§2, §3) |
| Rendered visuals | Readable at print scale? Floats near references? (§23) | Figure generation script produces correct format? (§3) |
| Hyperparameters | Listed in the paper with rationale? (§6) | Values trace to config files, not hardcoded? (§1, §8) |
| Code availability | Statement exists in the paper? (§17) | Repo URL valid, README accurate, pipeline works? (§11) |
| Terminology | Abbreviations consistent within document? (§14) | Terms match code identifiers? (§5) |
| Significant figures | Consistent precision within document? (§12) | Precision matches script output? (§2) |
| Figure format | Appropriate format for document quality? (§12) | Format generated by script, not manually exported? (§3) |
| Computational cost | Reported in the paper? (§7) | Values trace to benchmarking scripts? (§1) |
| Macro-prose coherence | Prose framing appropriate for injected value? (§24) | Value traced to code, macro manifest produced? (§4) |
| Cross-element consistency | Prose, captions, figures, tables mutually consistent? (§24) | All elements from same run/pipeline output? (§9) |
Rule: This skill never opens the codebase. manuscript-provenance never judges prose quality. Each reads the other’s report when available.
Integration point – Macro Manifest: manuscript-provenance produces a macro manifest as part of its §4 audit: a structured list of every macro-injected value, its resolved numeric value, its source (script + output file), and its location(s) in the manuscript text. This skill’s Pass 13 (Cross-Element Coherence) consumes that manifest to check whether the prose surrounding each injected value is appropriate for the actual value. If no provenance report exists, this skill extracts macro values directly from .tex source (less precise, since there is no source tracing, but the coherence check still runs).
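For concreteness, one entry of such a macro manifest might look like the sketch below. All field names here are illustrative assumptions, since the manifest schema is owned by manuscript-provenance, not this skill:

```python
# Hypothetical shape of one macro-manifest entry (field names are
# illustrative; the actual manifest format is defined by the
# manuscript-provenance skill).
manifest_entry = {
    "macro": r"\MainAccuracy",            # macro name as used in the .tex source
    "value": "93.2",                      # resolved numeric value
    "source": {
        "script": "scripts/eval.py",      # script that produced the value
        "output": "results/metrics.json"  # file the macro was injected from
    },
    "locations": ["abstract", "sec:results p.2"],  # where it appears in prose
}

def coherence_input(entry):
    """Reduce a manifest entry to what Pass 13 needs: the resolved
    value and every prose location that must be checked against it."""
    return entry["value"], entry["locations"]
```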
Workflow
1. Ingest
Read the uploaded manuscript. Accept PDF, DOCX, LaTeX source, or Markdown. If multiple files are uploaded (e.g., main text + supplementary), process all of them.
Identify:
- Target venue (defaults to arXiv/preprint; adjust for conference or journal submissions)
- Submission type (full paper, technical report, thesis chapter, etc.)
- Any specific concerns the user raised; these get priority in the report
For arXiv submissions, compliance checks are advisory. Focus on technical quality, reproducibility, and clarity rather than strict formatting rules.
2. Load the Checklist
Read references/checklist.md, the comprehensive 24-section, ~175-checkpoint
refactoring checklist. Every audit pass is structured against this checklist.
Read references/checklist.md
3. Multi-Pass Audit
Execute the following passes sequentially. Each pass maps to one or more checklist sections. Work systematically â for each checkpoint:
- PASS: Note briefly, move on
- FAIL: Document with exact location (section, paragraph, line), specific defect, concrete fix required
- N/A: Mark if not applicable to this manuscript type
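The PASS/FAIL/N/A discipline above can be sketched as a small record type; this is a hypothetical internal structure, not a format the skill mandates:

```python
from dataclasses import dataclass

# Hypothetical record for one checkpoint verdict. FAIL entries must
# carry an exact location, the specific defect, and the concrete fix,
# mirroring the documentation requirement above.
@dataclass
class CheckpointResult:
    checkpoint_id: str   # e.g. "§20.1" from references/checklist.md
    status: str          # "PASS" | "FAIL" | "N/A"
    location: str = ""   # section, paragraph, line (required for FAIL)
    defect: str = ""     # what is wrong
    fix: str = ""        # concrete change required

result = CheckpointResult(
    "§20.1", "FAIL",
    location="Abstract, sentence 2",
    defect="strong causal claim backed only by correlational evidence",
    fix="soften 'causes' to 'is associated with'",
)
```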
Pass 1 – Structural Integrity (Checklist §1, §4, §5, §10)
- Trace the thesis-thread from abstract through conclusion
- Verify section-level necessity and logical dependency ordering
- Check introduction funnel structure and contribution enumeration
- Verify conclusion contains no new information and maps 1:1 to stated contributions
- Assess related work organization (taxonomic vs. annotated) and differentiation
Pass 2 – Abstract & Title Calibration (Checklist §2, §3)
- Abstract functional completeness (context → gap → approach → results → implication)
- Quantitative specificity in abstract
- Title precision-scope alignment
- Keyword-abstract coherence
Pass 3 – Technical Rigor (Checklist §6, §7)
- Reproducibility sufficiency of methodology (document-level: does the paper describe enough? Code-level verification deferred to manuscript-provenance)
- Assumption explicitness and notation consistency
- Baseline adequacy, dataset characterization, statistical rigor
- Effect size reporting, evaluation metric justification
- Computational cost reporting (checks paper reports it; value tracing to benchmarking scripts deferred to manuscript-provenance)
Pass 4 – Argumentation Quality (Checklist §8, §9)
- Discussion introduces no new results
- Alternative explanations considered
- Generalizability boundaries stated
- Limitations genuine (not performative), preemptively addressing reviewer objections
- Threat-to-validity taxonomy coverage
Pass 5 – Citation & Reference Hygiene (Checklist §11)
- Citation-reference bijection (no orphans in either direction)
- Style conformance to target venue
- Primary source preference over secondary citations
- Preprint-to-publication status check
- Citation placement (claim-level, not paragraph-level)
- Retraction check advisory
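The citation-reference bijection check can be approximated mechanically on LaTeX sources. A minimal sketch, assuming standard \cite/\citep/\citet commands and a BibTeX file (nested braces and \nocite are deliberately not handled):

```python
import re

def cited_keys(tex_text):
    """Collect every key used in a \\cite-family command."""
    keys = set()
    for m in re.finditer(r"\\cite[tp]?\*?(?:\[[^\]]*\])?\{([^}]*)\}", tex_text):
        keys.update(k.strip() for k in m.group(1).split(","))
    return keys

def bib_keys(bib_text):
    """Collect every entry key defined in the .bib file."""
    return set(re.findall(r"@\w+\{([^,\s]+)\s*,", bib_text))

def bijection_report(tex_text, bib_text):
    cited, defined = cited_keys(tex_text), bib_keys(bib_text)
    return {
        "orphan_citations": sorted(cited - defined),   # cited but not in .bib
        "orphan_references": sorted(defined - cited),  # in .bib but never cited
    }
```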
Pass 6 – Visual & Tabular Quality (Checklist §12)
- Sequential callout ordering
- Resolution and legibility assessment
- Colorblind accessibility
- Axis labels with units, consistent visual language
- Table alignment and significant figure consistency
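Sequential callout ordering is also mechanically checkable. A minimal sketch over rendered text, assuming numbered callouts of the form "Figure N" (LaTeX sources would need \ref resolution first):

```python
import re

def first_callout_order(text, kind="Figure"):
    """Return figure numbers in the order they are FIRST mentioned."""
    seen, order = set(), []
    for m in re.finditer(rf"{kind}\s+(\d+)", text):
        n = int(m.group(1))
        if n not in seen:
            seen.add(n)
            order.append(n)
    return order

def out_of_order(text, kind="Figure"):
    """Numbers first mentioned after a higher-numbered callout."""
    order = first_callout_order(text, kind)
    return [n for i, n in enumerate(order) if order[:i] and n < max(order[:i])]
```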
Pass 7 – Prose Mechanics (Checklist §13, §14, §15)
- Tense consistency (recommendations, not strict requirements)
- Hedging calibration (neither overclaiming nor vacuous)
- Passive voice patterns (advisory)
- Nominalization reduction opportunities
- Clarity and precision (marketing language advisory for arXiv)
- Abbreviation hygiene (first-use expansion, consistency)
- Mathematical typesetting consistency
Pass 8 – Best Practices & Reproducibility (Checklist §16, §17, §18, §19)
- Supplementary material cross-reference integrity
- Code/data availability statements exist in the paper (verification that claimed repos are valid and pipelines work deferred to manuscript-provenance)
- License compatibility for third-party assets
- Hyperlink verification and reference integrity
- Overall clarity and accessibility assessment
Pass 9 – Claims-Evidence Calibration (Checklist §20)
This is a dedicated pass through every assertion in the manuscript.
For each claim:
- Grade claim strength: strong/definitive (“X causes Y”), moderate/qualified (“X improves Y under conditions Z”), or hedged/tentative (“X may contribute to Y”)
- Grade evidence strength: direct experimental, indirect/correlational, citation-only, analogical, or no evidence
- Flag mismatches:
- Overclaim: Strong claim + weak evidence → soften the claim or add evidence
- Underclaim: Hedged language + strong evidence → sharpen the language
- Orphaned claim: Any strength + no evidence → add evidence or remove the claim
- Audit causal vs. correlational language against study design
- Check generalization scope against actual experimental conditions
- Verify comparative claims (“outperforms”, “better than”) against head-to-head evaluations actually present in the paper
- Flag implicit claims (e.g., “Unlike prior work, our approach handles X” implies prior work cannot; verify this)
- Check negation claims for evidence of absence vs. absence of evidence
This pass is HIGH priority. Claims-evidence mismatch is the single most common reason reviewers reject papers. An overclaim in the abstract poisons the entire reading.
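The mismatch rules of this pass can be sketched as a small decision table; the ordinal scales below simply encode the claim and evidence grades defined above:

```python
# Ordinal encodings of the claim and evidence grades from Pass 9.
CLAIM_RANK = {"hedged": 0, "moderate": 1, "strong": 2}
EVIDENCE_RANK = {"none": 0, "analogical": 1, "citation-only": 2,
                 "correlational": 3, "direct": 4}

def classify_mismatch(claim, evidence):
    """Apply the overclaim / underclaim / orphaned-claim rules."""
    if EVIDENCE_RANK[evidence] == 0:
        return "orphaned claim"        # any strength + no evidence
    if claim == "strong" and EVIDENCE_RANK[evidence] < 3:
        return "overclaim"             # soften the claim or add evidence
    if claim == "hedged" and evidence == "direct":
        return "underclaim"            # sharpen the language
    return "calibrated"
```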
Pass 10 – Narrative Flow & Coherence (Checklist §21)
Read the manuscript linearly, tracking the reader’s cognitive state. At each sentence and paragraph boundary, check:
- Does this sentence follow from the previous one, or does the reader need to make an inferential leap?
- Does this paragraph’s opening sentence state its point, or is the point buried?
- Does each sentence start with known information and end with new information (given-new contract)?
- Are cross-references between sentences ordered so the reader moves forward through the text, not zigzagging back?
- Does the last sentence of each paragraph connect to the first sentence of the next paragraph?
- Are there logic gaps where a premise is skipped because the author knows it implicitly?
- Does every setup/promise within a section get its payoff within that section?
- Does each section have a discernible arc (setup → content → landing)?
Flag any location where a domain-expert reader would need to re-read, scroll back, or pause to reconstruct the logical connection. These are flow breaks.
This pass is HIGH priority. Papers with strong results but poor narrative flow exhaust reviewers. A reader who has to fight the text stops trusting the author.
Pass 11 – Prose Microstructure (Checklist §22)
Sentence-level and paragraph-level patterns that compound into readability problems:
- Ambiguous referents: “this”, “it”, “they” without clear antecedents
- Information density spikes: paragraphs introducing too many new concepts at once
- Sentences requiring multiple re-reads: excessive clause nesting, misplaced modifiers, garden-path constructions
- Broken parallel structure in lists, comparisons, sequences
- Semantic redundancy: same point restated in nearby paragraphs without purpose
- Long-distance references: concepts introduced and referenced many paragraphs later without re-anchoring
- Dangling modifiers: “Using gradient descent, the loss function converged”
This pass is MEDIUM priority on individual items but compounds: a manuscript with 20 ambiguous pronouns, 10 density spikes, and 5 dangling modifiers is materially harder to read even though no single instance is fatal.
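Some of these patterns are mechanically detectable. A minimal sketch that flags sentence-initial "This" followed directly by a verb, a common ambiguous-referent shape ("This shows..." versus the anchored "This result shows..."); the verb list is a small illustrative sample, not exhaustive:

```python
import re

# Small illustrative verb sample; extend as needed.
VERBS = r"(?:is|was|are|were|shows|suggests|means|allows|enables|implies)"

def naked_this(text):
    """Return sentence-initial 'This <verb>' occurrences, which usually
    need an anchoring noun ('This result shows...')."""
    return [m.group(0)
            for m in re.finditer(rf"(?:^|(?<=[.!?]\s))This\s+{VERBS}\b", text)]
```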
Pass 12 – Rendered Document Inspection (Checklist §23)
This pass requires the compiled PDF. If only LaTeX source is provided, ask the user for the compiled PDF or compile it.
Open the PDF and inspect every page at actual print scale:
- Figures: For each figure, zoom to the size it will appear at in the final document. Check:
- All text (axis labels, tick labels, legend, annotations) readable
- No label overlap, collision, or truncation
- Legend placement not covering data
- Annotations pointing to correct elements
- Tables: Check column alignment, text wrapping, no content overflow
- Floats: For each figure/table, locate its first text reference. Measure the page distance. Flag anything >1 page away.
- Page breaks: Check no table splits across pages (unless intentionally long), no equation orphaned from its introduction, no header stranded at page bottom
- Margins: Check no content bleeds outside margins (equations, URLs, wide tables, wide figures)
- Visual consistency: Font sizes across figures comparable, color usage consistent
This pass is HIGH priority. A paper with illegible axis labels or a table split across pages signals carelessness to reviewers regardless of technical quality. These defects are invisible from source and the author often doesn’t notice because they read the paper in their editor, not in the compiled output.
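The float-distance check can be approximated once per-page text has been extracted from the compiled PDF (e.g. with pypdf's Page.extract_text()). A sketch, assuming captions follow the "Figure N:" or "Figure N." convention:

```python
import re

def first_page(pages, pattern):
    """Index of the first page whose text matches the pattern."""
    for i, text in enumerate(pages):
        if re.search(pattern, text):
            return i
    return None

def distant_floats(pages, max_distance=1):
    """Flag figures whose caption and first mention are more than
    max_distance pages apart. Probes figure numbers 1..99; caption
    detection via 'Figure N:' / 'Figure N.' is an assumed convention."""
    flagged = []
    for n in range(1, 100):
        cap = first_page(pages, rf"Figure {n}[:.]")  # caption page
        ref = first_page(pages, rf"Figure {n}\b")    # first mention anywhere
        if cap is None or ref is None:
            continue
        if abs(cap - ref) > max_distance:
            flagged.append((n, ref, cap))
    return flagged
```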
Pass 13 – Cross-Element Coherence (Checklist §24)
Read the manuscript as an integrated system. For each figure, table, and macro-injected value:
- Collect the element cluster: The visual/data itself, its caption, every prose passage that references it, and any macro values appearing in or near those passages
- Check four-way consistency: Does the prose claim match the visual? Does the caption describe the current content? Do the numbers agree across text, table, and figure? Does the qualitative language match the quantitative values?
- Check cross-reference accuracy: Every \ref points to the element the surrounding prose describes. After figure reordering, references often point to the wrong visual.
- Check macro-prose coherence: When a macro injects a number, read the sentence it sits in. Does the qualitative framing (“modest”, “dramatic”, “marginal”, “substantial”) match the actual numeric value? This is the handoff from manuscript-provenance: provenance traces the value to code; this pass verifies that the prose wrapping that value is appropriate.
- Check temporal consistency: Do all elements appear to come from the same experimental run? A figure from one run and a table from another is a coherence failure even if both are individually correct.
If a manuscript-provenance report exists, load its macro manifest (the list of all traced macro values with locations and source values) and use it as input for the macro-prose coherence check. If no provenance report exists, extract macro values directly from the .tex source.
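The fallback extraction can be as simple as scanning the source for numeric macro definitions. A sketch, assuming the common \newcommand{\Name}{value} pattern (\def and multi-token bodies are not handled):

```python
import re

def extract_macros(tex_text):
    """Map macro name -> body for macros whose body is a bare number
    (optionally with a trailing percent sign)."""
    pattern = r"\\newcommand\{\\(\w+)\}\{([^}]*)\}"
    return {name: body for name, body in re.findall(pattern, tex_text)
            if re.fullmatch(r"-?\d+(?:\.\d+)?%?", body.strip())}
```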
This pass is HIGH priority. Cross-element incoherence is the most insidious class of manuscript defect: each piece looks fine in isolation, but the system is broken. Reviewers notice because they read the document linearly and encounter contradictions the author can’t see because they edit pieces independently.
Note for arXiv: Ethics statements, anonymization, page limits, and strict formatting requirements are marked N/A by default. Focus on technical quality, reproducibility, and clarity.
4. Generate Refactoring Report
Produce the report as a structured document. Use references/report-template.md
as the output format.
Read references/report-template.md
Report structure:
- Executive Summary – Overall quality assessment (Publication-ready / Recommend revisions / Needs work). Top 5 high-priority improvements.
- Per-Section Diagnostics – For each manuscript section, the specific issues found, mapped to checklist checkpoint IDs. Severity tagged as HIGH (impacts clarity/credibility), MEDIUM (noticeable quality gap), or LOW (polish/optional improvement).
- Cross-Cutting Issues – Problems that span multiple sections (e.g., inconsistent notation, citation patterns, clarity patterns).
- Priority Queue – All issues ranked by impact × effort. HIGH-impact items first, then MEDIUM items ordered by estimated fix effort (lowest effort first = quick wins).
- Checklist Status – The full 24-section checklist with pass/needs-work/not-applicable status per checkpoint, referencing specific locations in the manuscript.
5. Triage and Priority Report
After completing the full scan, categorize issues:
- HIGH – Impacts technical credibility or reproducibility (missing baselines, orphaned claims, insufficient methodology details, broken references)
- MEDIUM – Reduces clarity or professional quality (inconsistent notation, vague claims, poor figure quality)
- LOW – Polish issues (citation formatting variations, minor typesetting, style preferences)
For arXiv submissions, focus HIGH priority on technical quality and reproducibility. Compliance items (ethics statements, formatting) are typically LOW priority or N/A.
Present the priority queue first, then the detailed findings.
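The ranking rule above, HIGH first and then cheaper fixes before expensive ones, can be sketched as a sort key (field names are illustrative):

```python
# Lower rank sorts first; within a severity band, lower effort wins.
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}

def priority_queue(issues):
    """Sort issues by severity band, then by estimated fix effort."""
    return sorted(issues, key=lambda i: (SEVERITY_RANK[i["severity"]],
                                         i["effort"]))

issues = [
    {"id": "C3", "severity": "MEDIUM", "effort": 1},
    {"id": "A1", "severity": "HIGH",   "effort": 5},
    {"id": "B2", "severity": "MEDIUM", "effort": 3},
    {"id": "D4", "severity": "LOW",    "effort": 1},
]
```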
6. Output
Save the report as a Markdown file in the same directory as the manuscript,
named [manuscript-name]-review-report.md.
Present the file to the user with a concise summary:
- Quality assessment verdict
- Count of HIGH/MEDIUM/LOW priority items
- Top 3 recommended improvements
Core Principles
- Focus on structure and clarity. This is a structural and technical audit. Sentence-level grammar is out of scope unless it forms a systematic pattern affecting readability.
- Evidence-based findings. Every issue cites the specific manuscript location (section, paragraph, figure/table number). No vague “could be better.”
- Balanced severity. HIGH priority for technical credibility and reproducibility issues. MEDIUM for clarity and professional quality. LOW for style preferences. ArXiv allows more flexibility than peer-reviewed venues.
- Context-aware recommendations. Formatting and compliance requirements vary by venue. For arXiv, prioritize technical quality over strict formatting. For journal submissions, adjust accordingly.
- Constructive framing. Frame findings as improvements to clarity, credibility, and reproducibility rather than as rejection risks. ArXiv is more forgiving; focus on making the work accessible and trustworthy.
- Direct communication. Report issues as issues with specific fixes, not as vague suggestions. But recognize that many “rules” are guidelines for arXiv.
- Systematic coverage. Work through the checklist methodically. Mark items as pass/needs-work/N/A based on actual content. ArXiv-specific items (anonymization, page limits, strict templates) default to N/A.
Example Invocation Patterns
User says any of:
- “Review my manuscript”
- “Check this paper before I submit”
- “Is this ready for submission”
- “Run pre-publication review”
- “Check my references”
- “Does the abstract work”
- “Review the methodology section”
- “Pre-submission checklist”
- “/manuscript-review”
All trigger this skill. Partial reviews (e.g., “just check citations”) still run the full audit: the user benefits from comprehensive diagnostics even when they only asked about one aspect.