audio-voice-recovery
npx skills add https://github.com/pproenca/dot-skills --skill audio-voice-recovery
Forensic Audio Research: Audio Voice Recovery Best Practices
Comprehensive audio forensics and voice recovery guide providing CSI-level capabilities for recovering voice from low-quality, low-volume, or damaged audio recordings. Contains 45 rules across 8 categories, prioritized by impact to guide audio enhancement, forensic analysis, and transcription workflows.
When to Apply
Reference these guidelines when:
- Recovering voice from noisy or low-quality recordings
- Enhancing audio for transcription or legal evidence
- Performing forensic audio authentication
- Analyzing recordings for tampering or splices
- Building automated audio processing pipelines
- Transcribing difficult or degraded speech
Rule Categories by Priority
| Priority | Category | Impact | Prefix | Rules |
|---|---|---|---|---|
| 1 | Signal Preservation & Analysis | CRITICAL | signal- | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | noise- | 5 |
| 3 | Spectral Processing | HIGH | spectral- | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | voice- | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | temporal- | 5 |
| 6 | Transcription & Recognition | MEDIUM | transcribe- | 5 |
| 7 | Forensic Authentication | MEDIUM | forensic- | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | tool- | 7 |
Quick Reference
1. Signal Preservation & Analysis (CRITICAL)
- signal-preserve-original – Never modify original recording
- signal-lossless-format – Use lossless formats for processing
- signal-sample-rate – Preserve native sample rate
- signal-bit-depth – Use maximum bit depth for processing
- signal-analyze-first – Analyze before processing
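The first two rules can be enforced mechanically: hash the original, copy it into a working directory, and verify the copy before touching anything. A minimal sketch (the function name and layout are illustrative, not part of the bundled scripts):

```python
import hashlib
import shutil
from pathlib import Path

def secure_working_copy(original: Path, workdir: Path) -> tuple[Path, str]:
    """Hash the original recording, copy it into a working directory,
    and verify the copy's hash matches before any processing begins."""
    digest = hashlib.sha256(original.read_bytes()).hexdigest()
    workdir.mkdir(parents=True, exist_ok=True)
    copy = workdir / original.name
    shutil.copy2(original, copy)  # copy2 preserves timestamps for the record
    if hashlib.sha256(copy.read_bytes()).hexdigest() != digest:
        raise IOError("working copy does not match original")
    return copy, digest
```

All subsequent processing should read only from the returned copy; the recorded digest lets the original be re-verified at any point.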
2. Noise Profiling & Estimation (CRITICAL)
- noise-profile-silence – Extract noise profile from silent segments
- noise-identify-type – Identify noise type before reduction
- noise-adaptive-estimation – Use adaptive estimation for non-stationary noise
- noise-snr-assessment – Measure SNR before and after
- noise-avoid-overprocessing – Avoid over-processing and musical artifacts
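The noise-snr-assessment rule reduces to comparing signal power against the power of a hand-picked noise-only segment. A rough sketch with NumPy (the function name is illustrative; this estimates a power ratio, not a perceptual metric):

```python
import numpy as np

def estimate_snr_db(samples: np.ndarray, noise_segment: np.ndarray) -> float:
    """Estimate SNR in dB by comparing mean power of the full signal
    against mean power of a noise-only segment selected by the analyst."""
    signal_power = np.mean(samples.astype(np.float64) ** 2)
    noise_power = np.mean(noise_segment.astype(np.float64) ** 2)
    if noise_power == 0:
        return float("inf")  # digitally silent "noise" segment
    return 10.0 * np.log10(signal_power / noise_power)
```

Run it before and after each processing stage; an SNR that rises while intelligibility falls is a classic over-processing signature.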
3. Spectral Processing (HIGH)
- spectral-subtraction – Apply spectral subtraction for stationary noise
- spectral-wiener-filter – Use Wiener filter for optimal noise estimation
- spectral-notch-filter – Apply notch filters for tonal interference
- spectral-band-limiting – Apply frequency band limiting for speech
- spectral-equalization – Use forensic equalization to restore intelligibility
- spectral-declip – Repair clipped audio before other processing
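Single-frame spectral subtraction is simple enough to sketch directly: subtract an estimated noise magnitude spectrum, keep the original phase, and clamp at a spectral floor to limit musical-noise artifacts. This is a teaching sketch, not a production denoiser (real pipelines use overlap-add across many frames):

```python
import numpy as np

def spectral_subtract(frame: np.ndarray, noise_mag: np.ndarray,
                      floor: float = 0.02) -> np.ndarray:
    """Subtract an estimated noise magnitude spectrum from one windowed
    frame, clamping at a spectral floor to limit musical noise."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    mag = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Never let a bin drop below floor * original magnitude.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(frame))
```

The floor parameter is the main over-processing guard: set it to zero and residual noise turns into isolated tonal "musical" artifacts.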
4. Voice Isolation & Enhancement (HIGH)
- voice-rnnoise – Use RNNoise for real-time ML denoising
- voice-dialogue-isolate – Use source separation for complex backgrounds
- voice-formant-preserve – Preserve formants during pitch manipulation
- voice-dereverb – Apply dereverberation for room echo
- voice-enhance-speech – Use AI speech enhancement services for quick results
- voice-vad-segment – Use VAD for targeted processing
- voice-frequency-boost – Boost frequency regions for specific phonemes
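For voice-vad-segment, a crude energy-based detector is often enough to target processing at speech regions before reaching for an ML VAD. A minimal sketch (frame length and threshold are illustrative; samples assumed float, full scale ±1.0):

```python
import numpy as np

def energy_vad(samples: np.ndarray, sr: int,
               frame_ms: int = 30, threshold_db: float = -35.0) -> list:
    """Crude energy-based voice-activity detection: mark frames whose
    RMS level in dBFS exceeds a fixed threshold."""
    frame_len = int(sr * frame_ms / 1000)
    active = []
    for i in range(len(samples) // frame_len):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        level_db = 20 * np.log10(max(rms, 1e-10))  # avoid log of zero
        active.append(level_db > threshold_db)
    return active
```

Energy VAD fails on low-SNR material, which is exactly where this skill operates; treat it as a first pass and fall back to a trained VAD when the flags look implausible.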
5. Temporal Processing (MEDIUM-HIGH)
- temporal-dynamic-range – Use dynamic range compression for level consistency
- temporal-noise-gate – Apply noise gate to silence non-speech segments
- temporal-time-stretch – Use time stretching for intelligibility
- temporal-transient-repair – Repair transient damage (clicks, pops, dropouts)
- temporal-silence-trim – Trim silence and normalize before export
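The temporal-dynamic-range rule is easiest to see as a static gain curve: levels above a threshold are reduced by a ratio, quiet material passes through. A minimal per-sample sketch (real compressors add attack/release smoothing, omitted here):

```python
import numpy as np

def compress(samples: np.ndarray, threshold_db: float = -20.0,
             ratio: float = 4.0) -> np.ndarray:
    """Static per-sample compression: the portion of each sample's level
    above the threshold is divided by `ratio`; quiet samples pass through."""
    eps = 1e-10  # avoid log of zero on silent samples
    level_db = 20 * np.log10(np.maximum(np.abs(samples), eps))
    over = level_db > threshold_db
    out_db = np.where(over,
                      threshold_db + (level_db - threshold_db) / ratio,
                      level_db)
    return np.sign(samples) * 10 ** (out_db / 20)
```

With threshold -20 dB and ratio 4:1, a full-scale sample (0 dB) comes out at -15 dB, which is why compressed speech then needs makeup gain or loudness normalization.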
6. Transcription & Recognition (MEDIUM)
- transcribe-whisper – Use Whisper for noise-robust transcription
- transcribe-multipass – Use multi-pass transcription for difficult audio
- transcribe-segment – Segment audio for targeted transcription
- transcribe-confidence – Track confidence scores for uncertain words
- transcribe-hallucination – Detect and filter ASR hallucinations
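The confidence and hallucination rules can be combined into one post-filter over transcription segments. The field names below (`avg_logprob`, `no_speech_prob`) follow openai-whisper's segment dicts; the thresholds are illustrative starting points, not validated cutoffs:

```python
def filter_segments(segments, min_avg_logprob=-1.0, max_no_speech_prob=0.6):
    """Split transcription segments into kept vs flagged: flag likely
    hallucinations (very low average log-probability) and segments the
    model itself suspects contain no speech."""
    kept, flagged = [], []
    for seg in segments:
        if (seg.get("avg_logprob", 0.0) < min_avg_logprob
                or seg.get("no_speech_prob", 0.0) > max_no_speech_prob):
            flagged.append(seg)
        else:
            kept.append(seg)
    return kept, flagged
```

Flagged segments should be reviewed by listening, not silently dropped: in forensic work a discarded true utterance is worse than a bracketed uncertain one.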
7. Forensic Authentication (MEDIUM)
- forensic-enf-analysis – Use ENF analysis for timestamp verification
- forensic-metadata – Extract and verify audio metadata
- forensic-tampering – Detect audio tampering and splices
- forensic-chain-custody – Document chain of custody for evidence
- forensic-speaker-id – Extract speaker characteristics for identification
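Real splice detection examines spectral continuity and ENF phase, but abrupt waveform discontinuities are a useful first screen and take only a few lines of NumPy. A crude sketch, not a substitute for forensic tooling:

```python
import numpy as np

def find_discontinuities(samples: np.ndarray,
                         jump_threshold: float = 0.5) -> list:
    """Return sample indices where consecutive samples jump by more than
    the threshold — candidate hard splices, dropouts, or edit points."""
    diffs = np.abs(np.diff(samples))
    return np.nonzero(diffs > jump_threshold)[0].tolist()
```

Every flagged index is only a candidate: crossfaded splices produce no such jump, so an empty result never clears a recording.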
8. Tool Integration & Automation (LOW-MEDIUM)
- tool-ffmpeg-essentials – Master essential FFmpeg audio commands
- tool-sox-commands – Use SoX for advanced audio manipulation
- tool-python-pipeline – Build Python audio processing pipelines
- tool-audacity-workflow – Use Audacity for visual analysis and manual editing
- tool-install-guide – Install audio forensic toolchain
- tool-batch-automation – Automate batch processing workflows
- tool-quality-assessment – Measure audio quality metrics
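For tool-batch-automation, driving FFmpeg from Python keeps the filter chain versioned alongside the case file. A minimal sketch (the function name, filter chain, and injectable `runner` parameter are illustrative; `runner` exists so the loop can be tested without FFmpeg installed):

```python
import subprocess
from pathlib import Path

def batch_enhance(in_dir: Path, out_dir: Path,
                  filters: str = "highpass=f=80,afftdn=nr=12",
                  runner=subprocess.run) -> None:
    """Run one FFmpeg filter chain over every WAV in a directory,
    writing enhanced copies without touching the originals."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for wav in sorted(in_dir.glob("*.wav")):
        runner(["ffmpeg", "-nostdin", "-y", "-i", str(wav),
                "-af", filters, str(out_dir / wav.name)],
               check=True)
```

Log the exact filter string with each batch run; identical settings across files is what makes batch results defensible.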
Essential Tools
| Tool | Purpose | Install |
|---|---|---|
| FFmpeg | Format conversion, filtering | brew install ffmpeg |
| SoX | Noise profiling, effects | brew install sox |
| Whisper | Speech transcription | pip install openai-whisper |
| librosa | Python audio analysis | pip install librosa |
| noisereduce | ML noise reduction | pip install noisereduce |
| Audacity | Visual editing | brew install audacity |
Workflow Scripts (Recommended)
Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.
- scripts/preflight_audio.py – Generate a forensic preflight report (JSON or Markdown).
- scripts/plan_from_preflight.py – Create a workflow plan template from the preflight report.
- scripts/compare_audio.py – Compare objective metrics between baseline and processed audio.
Example usage:
```shell
# 1) Analyze and capture baseline metrics
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2) Generate a workflow plan template
python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md

# 3) Compare baseline vs processed metrics
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
Forensic Preflight Workflow (Do This Before Any Changes)
Align preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001).
Establish an objective baseline state and plan the workflow so processing does not introduce clipping, artifacts, or false “done” confidence.
Use scripts/preflight_audio.py to capture baseline metrics and preserve the report with the case file.
Capture and record before processing:
- Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
- Record signal integrity: sample rate, bit depth, channels, duration
- Measure baseline loudness and levels: LUFS/LKFS, true peak, peak, RMS, dynamic range, DC offset
- Detect clipping and document clipped-sample percentage, peak headroom, exact time ranges
- Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
- Locate the region of interest (ROI) and document time ranges and changes over time
- Inspect spectral content and estimate speech-band energy and intelligibility risk
- Scan for temporal defects: dropouts, discontinuities, splices, drift
- Evaluate channel correlation and phase anomalies (if stereo)
- Extract and preserve metadata: timestamps, device/model tags, embedded notes
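Most of the level checks above can be computed directly with NumPy. A minimal sketch (samples assumed float, normalized to ±1.0; LUFS per BS.1770 requires a dedicated implementation such as pyloudnorm and is omitted):

```python
import numpy as np

def baseline_metrics(samples: np.ndarray) -> dict:
    """Compute core preflight level metrics: peak and RMS in dBFS,
    DC offset, and the percentage of samples at or beyond full scale."""
    eps = 1e-10  # avoid log of zero on digital silence
    peak = float(np.max(np.abs(samples)))
    rms = float(np.sqrt(np.mean(samples ** 2)))
    return {
        "peak_dbfs": 20 * np.log10(max(peak, eps)),
        "rms_dbfs": 20 * np.log10(max(rms, eps)),
        "dc_offset": float(np.mean(samples)),
        "clipped_pct": 100.0 * float(np.mean(np.abs(samples) >= 0.999)),
    }
```

Store the returned dict with the case file so every later comparison runs against the same recorded baseline rather than a re-measurement.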
Procedure:
- Prepare a forensic working copy, verify hashes, and preserve the original untouched.
- Locate ROI and target signal; document exact time ranges and changes across the recording.
- Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
- Identify required processing and plan a workflow order that avoids unwanted artifacts. Generate a plan draft with scripts/plan_from_preflight.py and complete it with case-specific decisions.
- Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
- Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
- Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
- If stereo, evaluate channel correlation and phase; document anomalies.
- Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
Failure-pattern guardrails:
- Do not process until every preflight field is captured.
- Document every process, setting, software version, and time segment to enable repeatability.
- Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
- Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
- Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
- Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
- If the request is not achievable, communicate limitations and do not declare completion.
- Require objective metrics and A/B listening before declaring completion.
- Do not rely solely on objective metrics; corroborate with critical listening.
- Take listening breaks to avoid ear fatigue during extended reviews.
Quick Enhancement Pipeline
```shell
# 1. Analyze original (run preflight and capture baseline metrics)
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2. Create working copy with checksum
cp evidence.wav working.wav
sha256sum evidence.wav > evidence.sha256

# 3. Apply enhancement
ffmpeg -i working.wav -af "\
highpass=f=80,\
adeclick=w=55:o=75,\
afftdn=nr=12:nf=-30:nt=w,\
equalizer=f=2500:t=q:w=1:g=3,\
loudnorm=I=-16:TP=-1.5:LRA=11\
" enhanced.wav

# 4. Transcribe
whisper enhanced.wav --model large-v3 --language en

# 5. Verify original unchanged
sha256sum -c evidence.sha256

# 6. Verify improvement (objective comparison + A/B listening)
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
How to Use
Read individual reference files for detailed explanations and code examples:
- Section definitions – Category structure and impact levels
- Rule template – Template for adding new rules
Reference Files
| File | Description |
|---|---|
| AGENTS.md | Complete compiled guide with all rules |
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |