malware-forensics
npx skills add https://github.com/sherifeldeeb/agentskills --skill malware-forensics
Malware Forensics
Comprehensive malware forensics skill for analyzing malicious software samples. Enables static and dynamic analysis, extraction of indicators of compromise, attribution research, and documentation of malware capabilities for incident response and threat intelligence.
Capabilities
- Static Analysis: Analyze malware without execution (strings, headers, imports)
- PE Analysis: Parse Windows executables, DLLs, and drivers
- Document Analysis: Analyze malicious Office documents and PDFs
- Script Analysis: Analyze malicious scripts (PowerShell, VBA, JavaScript)
- IOC Extraction: Extract IPs, domains, URLs, hashes, and other indicators
- YARA Scanning: Scan samples with YARA rules for identification
- String Analysis: Extract and categorize strings from samples
- Behavior Analysis: Document observed malware behavior
- Unpacking Support: Identify and document packed samples
- Attribution Analysis: Link samples to threat actors or campaigns
Quick Start
from malware_forensics import MalwareAnalyzer, PEAnalyzer, IOCExtractor
# Analyze sample
analyzer = MalwareAnalyzer("/samples/malware.exe")
report = analyzer.analyze()
# Extract IOCs
extractor = IOCExtractor("/samples/malware.exe")
iocs = extractor.extract_all()
# Scan with YARA
matches = analyzer.yara_scan("/rules/malware.yar")
Usage
Task 1: PE File Analysis
Input: Windows executable (EXE, DLL, SYS)
Process:
- Parse PE headers
- Analyze imports/exports
- Check for anomalies
- Extract resources
- Calculate hashes
Output: PE analysis report
Example:
from malware_forensics import PEAnalyzer
# Initialize PE analyzer
analyzer = PEAnalyzer("/samples/suspicious.exe")
# Get basic info
info = analyzer.get_basic_info()
print(f"File: {info.filename}")
print(f"Size: {info.size}")
print(f"MD5: {info.md5}")
print(f"SHA256: {info.sha256}")
print(f"SSDeep: {info.ssdeep}")
print(f"Type: {info.file_type}")
# Get PE headers
headers = analyzer.get_headers()
print(f"Machine: {headers.machine}")
print(f"Timestamp: {headers.timestamp}")
print(f"Subsystem: {headers.subsystem}")
print(f"Entry point: 0x{headers.entry_point:x}")
print(f"Image base: 0x{headers.image_base:x}")
# Get sections
sections = analyzer.get_sections()
for section in sections:
    print(f"Section: {section.name}")
    print(f" Virtual size: {section.virtual_size}")
    print(f" Raw size: {section.raw_size}")
    print(f" Entropy: {section.entropy}")
    print(f" MD5: {section.md5}")
# Get imports
imports = analyzer.get_imports()
for dll, functions in imports.items():
    print(f"Import: {dll}")
    for func in functions:
        print(f" - {func}")
# Get exports
exports = analyzer.get_exports()
for export in exports:
    print(f"Export: {export.name} @ {export.ordinal}")
# Detect anomalies
anomalies = analyzer.detect_anomalies()
for a in anomalies:
    print(f"ANOMALY: {a.type}")
    print(f" Description: {a.description}")
    print(f" Severity: {a.severity}")
# Get resources
resources = analyzer.get_resources()
for resource in resources:
    print(f"Resource: {resource.type}/{resource.name}")
    print(f" Size: {resource.size}")
    print(f" Language: {resource.language}")
# Check for packing
packing = analyzer.detect_packing()
print(f"Packed: {packing.is_packed}")
print(f"Packer: {packing.packer_name}")
print(f"Confidence: {packing.confidence}")
# Generate report
analyzer.generate_report("/evidence/pe_analysis.html")
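To make the "parse PE headers" and entropy steps concrete, here is a minimal standard-library sketch. It is not the skill's implementation: `parse_pe_header` and `shannon_entropy` are hypothetical helper names, and only three COFF header fields are read.

```python
import math
import struct
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def parse_pe_header(data: bytes) -> dict:
    """Read machine, section count, and timestamp from raw PE bytes."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    if len(data) < pe_offset + 12:
        raise ValueError("truncated COFF header")
    # The COFF file header starts right after the 4-byte signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    machine, num_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {"machine": machine, "sections": num_sections, "timestamp": timestamp}
```

Section entropy close to 8 bits per byte is a common hint of compression or encryption, which is one reason packing detection usually looks at it; treat it as a signal, not proof.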
Task 2: String Analysis
Input: Malware sample
Process:
- Extract ASCII/Unicode strings
- Categorize by type
- Identify suspicious strings
- Extract encoded strings
- Document findings
Output: String analysis with categorization
Example:
from malware_forensics import StringAnalyzer
# Initialize string analyzer
analyzer = StringAnalyzer("/samples/malware.exe")
# Extract all strings
strings = analyzer.extract_all(min_length=4)
print(f"Total strings: {len(strings)}")
# Get strings by category
categorized = analyzer.categorize()
print(f"URLs: {len(categorized.urls)}")
for url in categorized.urls:
    print(f" {url}")
print(f"IPs: {len(categorized.ips)}")
for ip in categorized.ips:
    print(f" {ip}")
print(f"Domains: {len(categorized.domains)}")
for domain in categorized.domains:
    print(f" {domain}")
print(f"Registry keys: {len(categorized.registry)}")
for reg in categorized.registry:
    print(f" {reg}")
print(f"File paths: {len(categorized.file_paths)}")
for path in categorized.file_paths:
    print(f" {path}")
# Find suspicious strings
suspicious = analyzer.find_suspicious()
for s in suspicious:
    print(f"SUSPICIOUS: {s.value}")
    print(f" Category: {s.category}")
    print(f" Reason: {s.reason}")
# Decode encoded strings
decoded = analyzer.decode_strings()
for d in decoded:
    print(f"Encoded: {d.encoded[:50]}...")
    print(f" Encoding: {d.encoding}")
    print(f" Decoded: {d.decoded}")
# Find strings with XOR patterns
xor_strings = analyzer.find_xor_encoded()
for x in xor_strings:
    print(f"XOR key 0x{x.key:02x}: {x.decoded}")
# Export strings
analyzer.export("/evidence/strings.txt")
analyzer.export_json("/evidence/strings.json")
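The extraction and XOR steps above can be sketched with the standard library alone. This is an illustrative sketch, not the `StringAnalyzer` internals: the function names are hypothetical, only printable ASCII and UTF-16LE strings are covered, and the XOR search assumes a single-byte key and a known plaintext marker.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return printable ASCII strings, then UTF-16LE strings, of min_length or more."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


def find_xor_encoded(data: bytes, needle: bytes = b"http") -> list:
    """Try every non-zero single-byte key; keep keys whose output contains needle."""
    hits = []
    for key in range(1, 256):
        decoded = bytes(b ^ key for b in data)
        if needle in decoded:
            hits.append((key, decoded))
    return hits
```

Brute-forcing 255 keys is cheap for small buffers; multi-byte or rolling keys need a different approach.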
Task 3: Document Analysis
Input: Malicious document (Office, PDF)
Process:
- Parse document structure
- Extract macros/scripts
- Analyze embedded objects
- Detect exploits
- Extract payloads
Output: Document analysis report
Example:
from malware_forensics import DocumentAnalyzer
# Analyze Office document
analyzer = DocumentAnalyzer("/samples/malicious.docx")
# Get document info
info = analyzer.get_info()
print(f"Format: {info.format}")
print(f"Created: {info.created}")
print(f"Modified: {info.modified}")
print(f"Author: {info.author}")
print(f"Has macros: {info.has_macros}")
print(f"Has embedded: {info.has_embedded}")
# Extract macros
macros = analyzer.extract_macros()
for macro in macros:
    print(f"Macro: {macro.name}")
    print(f" Type: {macro.type}")
    print(f" Code preview: {macro.code[:200]}...")
    print(f" Suspicious: {macro.is_suspicious}")
# Analyze VBA code
vba_analysis = analyzer.analyze_vba()
for finding in vba_analysis.findings:
    print(f"VBA Finding: {finding.type}")
    print(f" Description: {finding.description}")
    print(f" Code: {finding.code_snippet}")
# Get auto-execute triggers
triggers = analyzer.get_auto_triggers()
for trigger in triggers:
    print(f"Trigger: {trigger.name}")
    print(f" Type: {trigger.trigger_type}")
# Extract embedded objects
embedded = analyzer.extract_embedded("/evidence/embedded/")
for obj in embedded:
    print(f"Embedded: {obj.filename}")
    print(f" Type: {obj.content_type}")
    print(f" SHA256: {obj.sha256}")
# Detect exploits
exploits = analyzer.detect_exploits()
for exploit in exploits:
    print(f"EXPLOIT: {exploit.cve}")
    print(f" Description: {exploit.description}")
    print(f" Confidence: {exploit.confidence}")
# PDF-specific analysis
if info.format == "PDF":
    pdf_analysis = analyzer.analyze_pdf_structure()
    print(f"JavaScript: {pdf_analysis.has_javascript}")
    print(f"OpenAction: {pdf_analysis.has_openaction}")
    print(f"Embedded files: {pdf_analysis.embedded_files}")
# Generate report
analyzer.generate_report("/evidence/document_analysis.html")
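As a rough illustration of the "parse document structure" step for modern Office files: OOXML documents are ZIP archives, so a first check for macros and embedded objects can be done by listing archive members. The helper name `ooxml_macro_parts` is hypothetical, and the sketch does not handle legacy OLE2 files (`.doc`, `.xls`), which are not ZIP archives and raise `zipfile.BadZipFile` here.

```python
import zipfile


def ooxml_macro_parts(path_or_file) -> dict:
    """List archive members that suggest VBA macros or embedded OLE objects."""
    with zipfile.ZipFile(path_or_file) as archive:
        names = archive.namelist()
    return {
        # Macro-enabled documents carry their VBA project in vbaProject.bin.
        "vba_projects": [n for n in names if n.lower().endswith("vbaproject.bin")],
        # Embedded objects are stored under an embeddings/ folder.
        "embedded": [n for n in names if "/embeddings/" in n.lower()],
    }
```

Listing members never executes document content, which makes it a safe first pass before extracting anything to disk.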
Task 4: Script Analysis
Input: Malicious script file
Process:
- Identify script type
- Deobfuscate code
- Analyze behavior
- Extract IOCs
- Document capabilities
Output: Script analysis report
Example:
from malware_forensics import ScriptAnalyzer
# Analyze script
analyzer = ScriptAnalyzer("/samples/malicious.ps1")
# Get script info
info = analyzer.get_info()
print(f"Type: {info.script_type}")
print(f"Size: {info.size}")
print(f"Encoding: {info.encoding}")
print(f"Obfuscated: {info.is_obfuscated}")
# Deobfuscate script
deobfuscated = analyzer.deobfuscate()
print(f"Deobfuscation stages: {deobfuscated.stages}")
print(f"Final code preview: {deobfuscated.final_code[:500]}...")
# Analyze PowerShell-specific features
if info.script_type == "PowerShell":
    ps_analysis = analyzer.analyze_powershell()
    print(f"Download cradles: {ps_analysis.download_cradles}")
    print(f"Encoded commands: {ps_analysis.encoded_commands}")
    print(f"Bypass techniques: {ps_analysis.bypass_techniques}")
# Get suspicious patterns
patterns = analyzer.find_suspicious_patterns()
for p in patterns:
    print(f"Pattern: {p.name}")
    print(f" Description: {p.description}")
    print(f" Code: {p.code_snippet}")
    print(f" MITRE: {p.mitre_technique}")
# Analyze JavaScript
if info.script_type == "JavaScript":
    js_analysis = analyzer.analyze_javascript()
    print(f"Eval calls: {js_analysis.eval_calls}")
    print(f"Document.write: {js_analysis.document_writes}")
    print(f"External requests: {js_analysis.external_requests}")
# Extract IOCs
iocs = analyzer.extract_iocs()
print(f"URLs: {iocs.urls}")
print(f"Domains: {iocs.domains}")
print(f"IPs: {iocs.ips}")
# Get execution flow
flow = analyzer.analyze_execution_flow()
for step in flow:
    print(f"Step {step.order}: {step.description}")
    print(f" Action: {step.action}")
# Generate report
analyzer.generate_report("/evidence/script_analysis.html")
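One deobfuscation stage that comes up often with PowerShell is the `-EncodedCommand` argument, which carries Base64 of the UTF-16LE command text. The sketch below decodes such arguments; `decode_encoded_commands` is a hypothetical helper, and the regular expression only recognises a few common abbreviations of the flag, so it should not be read as how `ScriptAnalyzer` works.

```python
import base64
import binascii
import re

# Matches -EncodedCommand and a few common short forms, followed by Base64 text.
ENC_FLAG_RE = re.compile(
    r"-(?:encodedcommand|enc|ec|e)\s+([A-Za-z0-9+/=]{8,})", re.IGNORECASE
)


def decode_encoded_commands(text: str) -> list:
    """Return decoded command strings for each encoded-command argument found."""
    decoded = []
    for match in ENC_FLAG_RE.finditer(text):
        try:
            raw = base64.b64decode(match.group(1), validate=True)
            decoded.append(raw.decode("utf-16-le"))
        except (binascii.Error, UnicodeDecodeError):
            # Not valid Base64 or not UTF-16LE text; skip this candidate.
            continue
    return decoded
```

Decoded output may itself contain another encoded layer, so in practice the function would be applied repeatedly until nothing new decodes.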
Task 5: IOC Extraction
Input: Malware sample
Process:
- Extract network IOCs
- Extract file IOCs
- Extract registry IOCs
- Deduplicate and validate
- Export in multiple formats
Output: IOC collection
Example:
from malware_forensics import IOCExtractor
# Initialize extractor
extractor = IOCExtractor("/samples/malware.exe")
# Extract all IOCs
iocs = extractor.extract_all()
# Network IOCs
print(f"URLs ({len(iocs.urls)}):")
for url in iocs.urls:
    print(f" {url.value}")
    print(f" Context: {url.context}")
    print(f" Confidence: {url.confidence}")
print(f"Domains ({len(iocs.domains)}):")
for domain in iocs.domains:
    print(f" {domain.value}")
print(f"IPs ({len(iocs.ips)}):")
for ip in iocs.ips:
    print(f" {ip.value}")
    print(f" Type: {ip.ip_type}")  # C2, download, etc.
# File IOCs
print(f"File paths ({len(iocs.file_paths)}):")
for path in iocs.file_paths:
    print(f" {path.value}")
print(f"File hashes ({len(iocs.hashes)}):")
for h in iocs.hashes:
    print(f" {h.algorithm}: {h.value}")
# Registry IOCs
print(f"Registry keys ({len(iocs.registry_keys)}):")
for reg in iocs.registry_keys:
    print(f" {reg.value}")
# Mutexes
print(f"Mutexes ({len(iocs.mutexes)}):")
for mutex in iocs.mutexes:
    print(f" {mutex.value}")
# Validate IOCs
validated = extractor.validate_iocs(iocs)
print(f"Valid IOCs: {validated.valid_count}")
print(f"Invalid IOCs: {validated.invalid_count}")
# Enrich IOCs
enriched = extractor.enrich_iocs(
    iocs,
    sources=["virustotal", "threatfox", "urlhaus"]
)
# Export IOCs
extractor.export_csv("/evidence/iocs.csv")
extractor.export_json("/evidence/iocs.json")
extractor.export_stix("/evidence/iocs.stix")
extractor.export_misp("/evidence/iocs.misp.json")
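The "deduplicate and validate" step can be illustrated with a small regex-based extractor. This is a sketch under simplifying assumptions: the patterns are deliberately narrow (HTTP/HTTPS URLs, IPv4, SHA-256), `refang` and `extract_iocs` are hypothetical names, and real extractors cover many more indicator types.

```python
import ipaddress
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")


def refang(text: str) -> str:
    """Undo common defanging such as hxxp:// and [.]"""
    return text.replace("hxxp", "http").replace("[.]", ".").replace("[:]", ":")


def extract_iocs(text: str) -> dict:
    """Return deduplicated, validated indicators found in text."""
    text = refang(text)
    ips = set()
    for candidate in IPV4_RE.findall(text):
        try:
            # ipaddress rejects out-of-range octets such as 999.1.1.1.
            ips.add(str(ipaddress.IPv4Address(candidate)))
        except ValueError:
            continue
    return {
        "urls": sorted(set(URL_RE.findall(text))),
        "ips": sorted(ips),
        "sha256": sorted({h.lower() for h in SHA256_RE.findall(text)}),
    }
```

Validation here only checks syntax; whether an address is actually malicious is the job of the enrichment step.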
Task 6: YARA Scanning
Input: Malware sample(s) and YARA rules
Process:
- Compile YARA rules
- Scan samples
- Collect matches
- Document findings
- Generate report
Output: YARA scan results
Example:
from malware_forensics import YARAScanner
# Initialize scanner
scanner = YARAScanner()
# Add rule files
scanner.add_rules("/rules/malware_families.yar")
scanner.add_rules("/rules/packers.yar")
scanner.add_rules("/rules/exploits.yar")
# Add rule directory
scanner.add_rule_directory("/rules/")
# Scan single file
matches = scanner.scan_file("/samples/malware.exe")
for match in matches:
    print(f"Rule: {match.rule}")
    print(f" Namespace: {match.namespace}")
    print(f" Tags: {match.tags}")
    print(f" Meta: {match.meta}")
    print(" Strings matched:")
    for s in match.strings:
        print(f" {s.identifier}: {s.data} @ 0x{s.offset:x}")
# Scan directory
dir_matches = scanner.scan_directory("/samples/")
for file_path, matches in dir_matches.items():
    if matches:
        print(f"File: {file_path}")
        for match in matches:
            print(f" - {match.rule}")
# Scan with specific rules
specific = scanner.scan_file(
    "/samples/malware.exe",
    rules=["APT_Malware", "Ransomware"]
)
# Get statistics
stats = scanner.get_statistics()
print(f"Files scanned: {stats.files_scanned}")
print(f"Rules loaded: {stats.rules_loaded}")
print(f"Matches found: {stats.total_matches}")
# Export results
scanner.export_results("/evidence/yara_results.json")
scanner.generate_report("/evidence/yara_report.html")
Task 7: Behavior Analysis
Input: Malware execution observations
Process:
- Document file operations
- Track registry changes
- Monitor network activity
- Identify persistence
- Map to MITRE ATT&CK
Output: Behavior analysis report
Example:
from malware_forensics import BehaviorAnalyzer
# Initialize analyzer with sandbox report
analyzer = BehaviorAnalyzer()
analyzer.load_sandbox_report("/evidence/sandbox_report.json")
# Or manually add observations
analyzer.add_file_operation("create", "C:\\Windows\\Temp\\malware.exe")
analyzer.add_registry_operation("create", "HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\\Malware")
analyzer.add_network_connection("tcp", "203.0.113.50", 443)
analyzer.add_process_creation("cmd.exe", "/c whoami")
# Get file operations
file_ops = analyzer.get_file_operations()
for op in file_ops:
    print(f"File: {op.operation} - {op.path}")
    print(f" Time: {op.timestamp}")
# Get registry operations
reg_ops = analyzer.get_registry_operations()
for op in reg_ops:
    print(f"Registry: {op.operation} - {op.key}")
    print(f" Value: {op.value}")
# Get network activity
network = analyzer.get_network_activity()
for conn in network:
    print(f"Network: {conn.protocol} {conn.destination}:{conn.port}")
    print(f" DNS: {conn.dns_query}")
# Get process activity
processes = analyzer.get_process_activity()
for proc in processes:
    print(f"Process: {proc.name}")
    print(f" Command: {proc.command_line}")
    print(f" Parent: {proc.parent}")
# Map to MITRE ATT&CK
mitre = analyzer.map_to_mitre()
for technique in mitre:
    print(f"Technique: {technique.id} - {technique.name}")
    print(f" Tactic: {technique.tactic}")
    print(f" Evidence: {technique.evidence}")
# Identify capabilities
capabilities = analyzer.identify_capabilities()
for cap in capabilities:
    print(f"Capability: {cap.name}")
    print(f" Description: {cap.description}")
    print(f" Confidence: {cap.confidence}")
# Generate behavior report
analyzer.generate_report("/evidence/behavior_analysis.html")
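To show what "map to MITRE ATT&CK" can mean in its simplest form, the sketch below matches observations against a tiny hand-written rule table. The rule table, the `(kind, value)` observation shape, and the function name are assumptions made for illustration; a real mapping would be far larger and is not described in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    technique_id: str
    name: str
    tactic: str


# (observation kind, lowercase substring to look for, technique)
RULES = [
    ("registry", "\\currentversion\\run",
     Technique("T1547.001", "Registry Run Keys / Startup Folder", "Persistence")),
    ("process", "cmd.exe",
     Technique("T1059.003", "Windows Command Shell", "Execution")),
    ("process", "whoami",
     Technique("T1033", "System Owner/User Discovery", "Discovery")),
]


def map_to_mitre(observations) -> dict:
    """Map (kind, value) observations to techniques, keeping the evidence."""
    hits = {}
    for kind, value in observations:
        lowered = value.lower()
        for rule_kind, needle, technique in RULES:
            if kind == rule_kind and needle in lowered:
                hits.setdefault(technique, []).append(value)
    return hits
```

Keeping the matched value next to each technique preserves the evidence trail that a behavior report needs.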
Task 8: Sample Comparison
Input: Multiple malware samples
Process:
- Calculate similarity hashes
- Compare code sections
- Identify shared IOCs
- Find common patterns
- Cluster related samples
Output: Sample comparison results
Example:
from malware_forensics import SampleComparator
# Initialize comparator
comparator = SampleComparator()
# Add samples
comparator.add_sample("/samples/sample1.exe")
comparator.add_sample("/samples/sample2.exe")
comparator.add_sample("/samples/sample3.exe")
# Or add directory
comparator.add_directory("/samples/")
# Compare all samples
comparison = comparator.compare_all()
# Get similarity matrix
matrix = comparison.similarity_matrix
for sample1, similarities in matrix.items():
    for sample2, score in similarities.items():
        if score > 0.8:
            print(f"{sample1} <-> {sample2}: {score:.2f}")
# Find clusters
clusters = comparator.cluster_samples(threshold=0.7)
for i, cluster in enumerate(clusters):
    print(f"Cluster {i + 1}:")
    for sample in cluster:
        print(f" - {sample}")
# Compare specific samples
detail = comparator.compare_pair("/samples/a.exe", "/samples/b.exe")
print(f"Overall similarity: {detail.overall_score}")
print(f"Section similarity: {detail.section_scores}")
print(f"Import similarity: {detail.import_score}")
print(f"String similarity: {detail.string_score}")
print(f"Shared IOCs: {detail.shared_iocs}")
# Find shared code
shared_code = comparator.find_shared_code()
for code in shared_code:
    print(f"Shared code block at offset {code.offset}")
    print(f" Size: {code.size}")
    print(f" Samples: {code.samples}")
# Generate comparison report
comparator.generate_report("/evidence/comparison_report.html")
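The source does not say which similarity measure `SampleComparator` uses. As one plausible illustration, the sketch below scores samples by Jaccard similarity over feature sets (for example imported function names) and groups them with single-linkage clustering at a threshold. All names here are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_samples(features: dict, threshold: float = 0.7) -> list:
    """Group samples whose pairwise similarity reaches the threshold."""
    parent = {name: name for name in features}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(features, 2):
        if jaccard(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in features:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

Single linkage can chain loosely related samples into one cluster; a stricter linkage or a higher threshold trades recall for tighter groups.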
Task 9: Attribution Analysis
Input: Malware sample with IOCs
Process:
- Match against known families
- Compare with threat intel
- Identify TTPs
- Find related campaigns
- Document attribution
Output: Attribution analysis
Example:
from malware_forensics import AttributionAnalyzer
# Initialize analyzer
analyzer = AttributionAnalyzer("/samples/malware.exe")
# Match against malware families
families = analyzer.match_malware_families()
for family in families:
    print(f"Family: {family.name}")
    print(f" Confidence: {family.confidence}")
    print(f" Matching indicators: {family.indicators}")
# Check against threat intel
threat_intel = analyzer.check_threat_intel(
    feeds=["malwarebazaar", "virustotal", "threatfox"]
)
for intel in threat_intel:
    print(f"Intel: {intel.source}")
    print(f" Family: {intel.family}")
    print(f" Tags: {intel.tags}")
    print(f" First seen: {intel.first_seen}")
# Match TTPs to threat actors
actors = analyzer.match_threat_actors()
for actor in actors:
    print(f"Threat Actor: {actor.name}")
    print(f" Confidence: {actor.confidence}")
    print(f" Matching TTPs: {actor.matching_ttps}")
    print(f" Known aliases: {actor.aliases}")
# Find related campaigns
campaigns = analyzer.find_related_campaigns()
for campaign in campaigns:
    print(f"Campaign: {campaign.name}")
    print(f" Time range: {campaign.start_date} - {campaign.end_date}")
    print(f" Targets: {campaign.targets}")
# Get attribution summary
summary = analyzer.get_attribution_summary()
print(f"Most likely family: {summary.primary_family}")
print(f"Most likely actor: {summary.primary_actor}")
print(f"Confidence: {summary.overall_confidence}")
# Generate attribution report
analyzer.generate_report("/evidence/attribution.html")
Task 10: Malware Triage
Input: Collection of suspicious files
Process:
- Calculate hashes
- Check against known malware
- Quick static analysis
- Prioritize for analysis
- Generate triage report
Output: Triage results with priorities
Example:
from malware_forensics import MalwareTriage
# Initialize triage
triage = MalwareTriage()
# Add samples
triage.add_directory("/quarantine/")
# Run triage
results = triage.run()
print(f"Total samples: {results.total}")
print(f"Known malware: {results.known_malware}")
print(f"Suspicious: {results.suspicious}")
print(f"Clean: {results.clean}")
print(f"Unknown: {results.unknown}")
# Get prioritized list
prioritized = triage.get_prioritized()
for sample in prioritized:
    print(f"Priority {sample.priority}: {sample.filename}")
    print(f" Status: {sample.status}")
    print(f" Reason: {sample.reason}")
    print(f" Risk score: {sample.risk_score}")
# Get known malware
known = triage.get_known_malware()
for m in known:
    print(f"Known: {m.filename}")
    print(f" Family: {m.family}")
    print(f" Detection: {m.detection_name}")
# Get suspicious files
suspicious = triage.get_suspicious()
for s in suspicious:
    print(f"Suspicious: {s.filename}")
    print(f" Indicators: {s.indicators}")
# Export triage results
triage.export_csv("/evidence/triage_results.csv")
triage.generate_report("/evidence/triage_report.html")
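The first two triage steps (hash, then check against known malware) can be sketched with `hashlib`. This is a simplified illustration: `triage` and `sha256_file` are hypothetical helpers, the known-bad set is supplied by the caller, and no risk scoring is attempted.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large samples are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def triage(directory, known_bad: set) -> list:
    """Return (status, filename, sha256) rows with known malware listed first."""
    rows = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        file_hash = sha256_file(path)
        status = "known_malware" if file_hash in known_bad else "unknown"
        rows.append((status, path.name, file_hash))
    # Stable sort: known malware first, file-name order kept within each group.
    rows.sort(key=lambda row: row[0] != "known_malware")
    return rows
```

Files are only read, never executed, so this step is safe to run before any sandboxing decision.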
Configuration
Environment Variables
| Variable | Description | Required | Default |
|---|---|---|---|
| YARA_RULES_PATH | Default YARA rules directory | No | ./rules |
| VT_API_KEY | VirusTotal API key | No | None |
| MALWARE_BAZAAR_KEY | MalwareBazaar API key | No | None |
| SANDBOX_API | Sandbox service API URL | No | None |
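How the skill reads these variables internally is not shown in the source. As a sketch of one way to consume them with the defaults listed above, a loader could look like this; the `ForensicsConfig` class and `load_config` function are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ForensicsConfig:
    yara_rules_path: str
    vt_api_key: Optional[str]
    malware_bazaar_key: Optional[str]
    sandbox_api: Optional[str]


def load_config(env=None) -> ForensicsConfig:
    """Build a config from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return ForensicsConfig(
        yara_rules_path=env.get("YARA_RULES_PATH", "./rules"),
        vt_api_key=env.get("VT_API_KEY"),
        malware_bazaar_key=env.get("MALWARE_BAZAAR_KEY"),
        sandbox_api=env.get("SANDBOX_API"),
    )
```

Passing a plain dictionary as `env` keeps the loader easy to test without touching the real environment.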
Options
| Option | Type | Description |
|---|---|---|
| auto_deobfuscate | boolean | Auto-deobfuscate scripts |
| extract_resources | boolean | Extract PE resources |
| deep_string_analysis | boolean | Extended string analysis |
| check_threat_intel | boolean | Check against threat intel |
| parallel | boolean | Enable parallel processing |
Examples
Example 1: Incident Response Analysis
Scenario: Analyzing malware from compromised system
from malware_forensics import MalwareAnalyzer, IOCExtractor
# Analyze malware sample
analyzer = MalwareAnalyzer("/evidence/malware.exe")
analysis = analyzer.full_analysis()
# Extract IOCs for blocking
extractor = IOCExtractor("/evidence/malware.exe")
iocs = extractor.extract_all()
# Export for SIEM
extractor.export_stix("/evidence/block_iocs.stix")
# Generate IR report
analyzer.generate_ir_report("/evidence/malware_ir_report.html")
Example 2: Threat Intelligence
Scenario: Analyzing new malware for threat intel
from malware_forensics import MalwareAnalyzer, AttributionAnalyzer
# Full analysis
analyzer = MalwareAnalyzer("/samples/new_sample.exe")
analysis = analyzer.full_analysis()
# Attribution
attribution = AttributionAnalyzer("/samples/new_sample.exe")
actors = attribution.match_threat_actors()
campaigns = attribution.find_related_campaigns()
# Generate threat intel report
analyzer.generate_threat_intel_report("/evidence/threat_intel.html")
Limitations
- Static analysis does not observe runtime behavior; code that is decrypted or downloaded only during execution can be missed
- Packed samples may require manual unpacking
- Obfuscated code may resist analysis
- Attribution has inherent uncertainty
- Requires safe environment for handling samples
- Some formats may have limited support
- Threat intel depends on available data
Troubleshooting
Common Issue 1: PE Parsing Failure
Problem: Cannot parse PE file
Solution:
- Check file integrity
- May be packed or corrupted
- Try different parsing options
Common Issue 2: Deobfuscation Failure
Problem: Script remains obfuscated
Solution:
- Try manual deobfuscation
- Use dynamic analysis
- Check for custom obfuscation
Common Issue 3: YARA Rule Errors
Problem: YARA rules fail to compile
Solution:
- Check rule syntax
- Verify string escape sequences
- Update YARA version
Related Skills
- memory-forensics: Memory-based malware analysis
- disk-forensics: Find malware artifacts on disk
- network-forensics: Analyze malware traffic
- timeline-forensics: Malware timeline integration
- artifact-collection: Sample collection procedures