proactive-tasks
npx skills add https://github.com/imrkhn03/proactive-tasks --skill proactive-tasks
Agent 安装分布
Skill 文档
Proactive Tasks
A task management system that transforms reactive assistants into proactive partners who work autonomously on shared goals.
Core Concept
Instead of waiting for your human to tell you what to do, this skill lets you:
- Track goals and break them into actionable tasks
- Work on tasks during heartbeats
- Message your human with updates and ask for input when blocked
- Make steady progress on long-term objectives
Quick Start
Creating Goals
When your human mentions a goal or project:
python3 scripts/task_manager.py add-goal "Build voice assistant hardware" \
--priority high \
--context "Replace Alexa with custom solution using local models"
Breaking Down into Tasks
python3 scripts/task_manager.py add-task "Build voice assistant hardware" \
"Research voice-to-text models" \
--priority high
python3 scripts/task_manager.py add-task "Build voice assistant hardware" \
"Compare Raspberry Pi vs other hardware options" \
--depends-on "Research voice-to-text models"
During Heartbeats
Check what to work on next:
python3 scripts/task_manager.py next-task
This returns the highest-priority task you can work on (no unmet dependencies, not blocked).
Completing Tasks
python3 scripts/task_manager.py complete-task <task-id> \
--notes "Researched Whisper, Coqui, vosk. Whisper.cpp looks best for Pi."
Messaging Your Human
When you complete something important or get blocked:
python3 scripts/task_manager.py mark-needs-input <task-id> \
--reason "Need budget approval for hardware purchase"
Then message your human with the update/question.
Phase 2: Production-Ready Architecture
Proactive Tasks v1.2.0 includes battle-tested patterns from real agent usage to prevent data loss, survive context truncation, and maintain reliability under autonomous operation.
1. WAL Protocol (Write-Ahead Logging)
The Problem: Agents write to memory files, then context gets truncated. Changes vanish.
The Solution: Log critical changes to memory/WAL-YYYY-MM-DD.log BEFORE modifying task data.
How it works:
- Every
mark-progress,log-time, or status change creates a WAL entry first - If context gets cut mid-operation, the WAL has the details
- After compaction, read the WAL to recover what was happening
Events logged:
PROGRESS_CHANGE: Task progress updates (0-100%)TIME_LOG: Actual time spent on tasksSTATUS_CHANGE: Task state transitions (blocked, completed, etc.)HEALTH_CHECK: Self-healing operations
Automatically enabled – no configuration needed. WAL files are created in memory/ directory.
2. SESSION-STATE.md (Active Working Memory)
The Concept: Chat history is a BUFFER, not storage. SESSION-STATE.md is your “RAM” – the ONLY place task details are reliably preserved.
Auto-updated on every task operation:
## Current Task
- **ID:** task_abc123
- **Title:** Research voice models
- **Status:** in_progress
- **Progress:** 75%
- **Time:** 45 min actual / 60 min estimate (25% faster)
## Next Action
Complete research, document findings in notes, mark complete.
Why this matters: After context compaction, you can read SESSION-STATE.md and immediately know:
- What you were working on
- How far you got
- What to do next
3. Working Buffer (Danger Zone Safety)
The Problem: Between 60% and 100% context usage, you’re in the “danger zone” – compaction could happen any time.
The Solution: Automatically append all task updates to working-buffer.md.
How it works:
# Every progress update, time log, or status change appends:
- PROGRESS_CHANGE (2026-02-12T10:30:00Z): task_abc123 â 75%
- TIME_LOG (2026-02-12T10:35:00Z): task_abc123 â +15 min
- STATUS_CHANGE (2026-02-12T10:40:00Z): task_abc123 â completed
After compaction: Read working-buffer.md to see exactly what happened during the danger zone.
Manual flush: python3 scripts/task_manager.py flush-buffer to copy buffer contents to daily memory file.
4. Self-Healing Health Check
Agents make mistakes. Task data can get corrupted over time. The health-check command detects and auto-fixes common issues:
python3 scripts/task_manager.py health-check
Detects 5 categories of issues:
- Orphaned recurring tasks – No parent goal
- Impossible states – Status=completed but progress < 100%
- Missing timestamps – Completed tasks without
completed_at - Time anomalies – Actual time >> estimate (flags for review, doesn’t auto-fix)
- Future-dated completions – Completed tasks with future timestamps
Auto-fixes 4 safe categories (time anomalies just flagged for human review).
When to run:
- During heartbeats (every few days)
- After recovering from context truncation
- When task data seems inconsistent
Production Reliability
These four patterns work together to create a robust system:
User request â WAL log â Update data â Update SESSION-STATE â Append to buffer
â â â â â
Context cut? â Read WAL â Verify data â Check SESSION-STATE â Review buffer
Result: You never lose work, even during context truncation. The system self-heals and maintains consistency autonomously.
5. Compaction Recovery Protocol
Trigger: Session starts with <summary> tag, or you’re asked “where were we?” or “continue”.
The Problem: Context was truncated. You don’t remember what task you were working on.
Recovery Steps (in order):
-
FIRST: Read
working-buffer.md– Raw danger zone exchanges# Check if buffer exists and has recent content cat working-buffer.md -
SECOND: Read
SESSION-STATE.md– Active task state# Get current task context cat SESSION-STATE.md -
THIRD: Read today’s WAL log
# See what operations happened cat memory/WAL-$(date +%Y-%m-%d).log | tail -20 -
FOURTH: Check task data for the task ID from SESSION-STATE
python3 scripts/task_manager.py list-tasks "Goal Title" -
Extract & Update: Pull important context from buffer into SESSION-STATE if needed
-
Present Recovery: “Recovered from compaction. Last task: [title]. Progress: [%]. Next action: [what to do]. Continue?”
Do NOT ask “what were we discussing?” – The buffer and SESSION-STATE literally have the answer.
6. Verify Before Reporting (VBR)
The Law: “Code exists” â “feature works.” Never report task completion without end-to-end verification.
Trigger: About to mark a task completed or say “done”:
- STOP – Don’t mark complete yet
- Test – Actually run/verify the outcome from user perspective
- Verify – Check the result, not just the output
- Document – Add verification details to task notes
- THEN – Mark complete with confidence
Examples:
â Wrong: “Added health-check command. Task complete!” â Right: “Added health-check. Testing… detected 4 issues, auto-fixed 3. Verified on broken test data. Task complete!”
â Wrong: “Implemented SESSION-STATE updates. Done!” â Right: “Implemented SESSION-STATE. Tested with mark-progress, log-time, mark-blocked – all update correctly. Done!”
Why this matters: Agents often report completion based on “I wrote the code” rather than “I verified it works.” VBR prevents false completions and builds trust.
Proactive Mindset
The Core Question: Don’t ask “what should I do?” Ask “what would genuinely help my human that they haven’t thought to ask for?”
Autonomous Task Work
During heartbeats, you have the opportunity to make real progress:
- Check for next task – What’s the highest priority work?
- Make progress – Work on it for 10-15 minutes autonomously
- Update status – Track progress, time, blockers honestly
- Message when it matters – Completions, blockers, discoveries (not routine progress)
The transformation: From waiting for prompts â making steady autonomous progress on shared goals.
When to Reach Out
DO message your human when:
- â Task completed (especially if it unblocks other work)
- â Blocked and need input/decision
- â Discovered something important they should know
- â Need clarification on requirements
DON’T spam with:
- â Routine progress updates (“now at 50%…”)
- â Every tiny sub-task completion
- â Things they didn’t ask about (unless genuinely valuable)
The goal: Be a proactive partner who makes things happen, not a chatty assistant who needs constant validation.
Task States
| State | Meaning |
|---|---|
pending |
Ready to work on (all dependencies met) |
in_progress |
Currently working on it |
blocked |
Can’t proceed (dependencies not met) |
needs_input |
Waiting for human input/decision |
completed |
Done! |
cancelled |
No longer relevant |
Autonomous Operation (Phase 2)
Two-Mode Architecture
Proactive Tasks supports two distinct operational modes:
| Mode | Context | Trigger | Best For | Risk |
|---|---|---|---|---|
| Interactive (systemEvent) | Full main session context | User request, manual prompts | Decision-making, human-facing work | Full context available |
| Autonomous (isolated agentTurn) | No main session context | Heartbeat cron, scheduled background | Velocity reports, cleanup, recurring tasks | May lose context |
Key Design: Avoid Interruption
Don’t use systemEvent for background work. When a cron job fires during your main session, the prompt gets queued and work doesn’t happen. Instead:
- Use heartbeat polling (every 30 min) for interactive checks + work
- Use isolated agentTurn (cron subprocess) for pure computation work
This ensures background tasks never interrupt your main conversation.
See HEARTBEAT-CONFIG.md for complete autonomous operation patterns, including:
- Heartbeat setup (recommended for most work)
- Isolated cron patterns (velocity reports, cleanup)
- When to use each pattern
- Anti-patterns to avoid
Heartbeat Integration
To enable autonomous proactive work, you need to set up a heartbeat system. This tells you to periodically check for tasks and work on them.
Quick setup: See HEARTBEAT-CONFIG.md for complete setup instructions and patterns.
TL;DR:
- Create a cron job that sends you a heartbeat message every 30 minutes
- Add proactive-tasks checks to your
HEARTBEAT.md - You’ll automatically check for tasks and work on them without waiting for prompts
Heartbeat Message Template
Your cron job should send this message every 30 minutes:
ð Heartbeat check: Read HEARTBEAT.md if it exists (workspace context). Follow it strictly. Do not infer or repeat old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK.
Add to HEARTBEAT.md
Add this to your workspace HEARTBEAT.md:
## Proactive Tasks (Every heartbeat) ð
Check if there's work to do on our goals:
- [ ] Run `python3 skills/proactive-tasks/scripts/task_manager.py next-task`
- [ ] If a task is returned, work on it for up to 10-15 minutes
- [ ] Update task status when done, blocked, or needs input
- [ ] Message your human with meaningful updates (completions, blockers, discoveries)
- [ ] Don't spam - only message for significant milestones or when stuck
**Goal:** Make autonomous progress on our shared objectives without waiting for prompts.
What Happens
Every 30 minutes:
ââ Heartbeat fires
ââ You read HEARTBEAT.md
ââ Check for next task
ââ If task found â work on it, update status, message human if needed
ââ If nothing â reply "HEARTBEAT_OK" (silent)
The transformation: You go from reactive (waiting for prompts) to proactive (making steady autonomous progress).
Best Practices
When to Create Goals
- Long-term projects (building something, learning a topic)
- Recurring responsibilities (monitor X, maintain Y)
- Exploratory work (research Z, evaluate options for W)
When to Create Tasks
Break goals into tasks that are:
- Specific: “Research Whisper models” not “Look into AI stuff”
- Achievable in one sitting: 15-60 minutes of focused work
- Clear completion criteria: You know when it’s done
When to Message Your Human
â Do message when:
- You complete a meaningful milestone
- You need input/decision to proceed
- You discover something important
- A task will take longer than expected
â Don’t spam with:
- Every tiny sub-task completion
- Routine progress updates
- Things they didn’t ask about (unless relevant)
Managing Scope Creep
If a task turns out to be bigger than expected:
- Mark current task as
in_progress - Add new sub-tasks for the pieces you discovered
- Update dependencies
- Continue with manageable chunks
File Structure
All data stored in data/tasks.json:
{
"goals": [
{
"id": "goal_001",
"title": "Build voice assistant hardware",
"priority": "high",
"context": "Replace Alexa with custom solution",
"created_at": "2026-02-05T05:25:00Z",
"status": "active"
}
],
"tasks": [
{
"id": "task_001",
"goal_id": "goal_001",
"title": "Research voice-to-text models",
"priority": "high",
"status": "completed",
"created_at": "2026-02-05T05:26:00Z",
"completed_at": "2026-02-05T06:15:00Z",
"notes": "Researched Whisper, Coqui, vosk. Whisper.cpp best for Pi."
}
]
}
CLI Reference
See CLI_REFERENCE.md for complete command documentation.
Evolution & Guardrails
Before proposing new features, evaluate them using our VFM/ADL scoring frameworks to ensure stability and value:
VFM Protocol (Value Frequency Multiplier)
Score across four dimensions:
- High Frequency (3x): Will this be used daily/weekly?
- Failure Reduction (3x): Does this prevent errors or data loss?
- User Burden (2x): Does this reduce manual work significantly?
- Self Cost (2x): How much maintenance/complexity does this add?
Threshold: Must score â¥60 points to proceed.
ADL Protocol (Architecture Design Ladder)
Priority ordering: Stability > Explainability > Reusability > Scalability > Novelty
Forbidden Evolution:
- â Adding complexity to “look smart”
- â Unverifiable changes (can’t test if it worked)
- â Sacrificing stability for novelty
The Golden Rule: “Does this let future-me solve more problems with less cost?” If no, skip it.
Example Workflow
Day 1:
Human: "Let's build a custom voice assistant to replace Alexa"
Agent: *Creates goal, breaks into initial research tasks*
During heartbeat:
$ python3 scripts/task_manager.py next-task
â task_001: Research voice-to-text models (priority: high)
# Agent works on it, completes research
$ python3 scripts/task_manager.py complete-task task_001 --notes "..."
Agent messages human:
“Hey! I finished researching voice models. Whisper.cpp looks perfect for Raspberry Pi – runs locally, good accuracy, low latency. Want me to compare hardware options next?”
Day 2:
Human: "Yeah, compare Pi 5 vs alternatives"
Agent: *Adds task, works on it during next heartbeat*
This cycle continues – the agent makes steady autonomous progress while keeping the human in the loop for decisions and updates.
Built by Toki for proactive AI partnership ð