axiom-hang-diagnostics
npx skills add https://github.com/charleswiltgen/axiom --skill axiom-hang-diagnostics
Hang Diagnostics
Systematic diagnosis and resolution of app hangs. A hang occurs when the main thread is blocked for more than 1 second, making the app unresponsive to user input.
Red Flags: Check This Skill When
| Symptom | This Skill Applies |
|---|---|
| App freezes briefly during use | Yes: likely hang |
| UI doesn’t respond to touches | Yes: main thread blocked |
| “App not responding” system dialog | Yes: severe hang |
| Xcode Organizer shows hang diagnostics | Yes: field hang reports |
| MetricKit MXHangDiagnostic received | Yes: aggregated hang data |
| Animations stutter or skip | Maybe: could be hitch, not hang |
| App feels slow but responsive | No: performance issue, not hang |
What Is a Hang
A hang is when the main runloop cannot process events for more than 1 second. The user taps, but nothing happens.
User taps → Main thread busy/blocked → Event queued → 1+ second delay → HANG
Key distinction: The main thread handles ALL user input. If it’s busy or blocked, the entire UI freezes.
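One way to observe this from inside a process is to submit a trivial block to a queue from another thread and time how long it waits before running. The sketch below is a minimal illustration under stated assumptions, not part of the skill: `pingLatency(of:timeout:)` is a hypothetical helper name, and it only measures queueing delay, not what caused it.

```swift
import Foundation
import Dispatch

/// Measures how long `queue` takes to start running a newly submitted block.
/// Returns nil if the block did not start within `timeout` seconds.
/// Hypothetical helper: call it from a thread that does NOT service `queue`.
func pingLatency(of queue: DispatchQueue, timeout: TimeInterval) -> TimeInterval? {
    let started = DispatchSemaphore(value: 0)
    let submitted = DispatchTime.now()

    // The ping sits behind whatever the queue is already doing.
    queue.async { started.signal() }

    // Wait on the calling thread, never on the queue being measured.
    guard started.wait(timeout: DispatchTime.now() + timeout) == .success else {
        return nil
    }
    let elapsedNanos = DispatchTime.now().uptimeNanoseconds - submitted.uptimeNanoseconds
    return TimeInterval(elapsedNanos) / 1_000_000_000
}
```

In an app, a background thread might call `pingLatency(of: .main, timeout: 1.0)` periodically and treat `nil` as a hang by this document's 1-second definition. Calling it on the main thread would block the very thread being measured.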
Hang vs Hitch vs Lag
| Issue | Duration | User Experience | Tool |
|---|---|---|---|
| Hang | >1 second | App frozen, unresponsive | Time Profiler, System Trace |
| Hitch | 1-3 frames (16-50ms) | Animation stutters | Animation Hitches instrument |
| Lag | 100-500ms | Feels slow but responsive | Time Profiler |
This skill covers hangs. For hitches, see axiom-swiftui-performance. For general lag, see axiom-performance-profiling.
The Two Causes of Hangs
Every hang has one of two root causes:
1. Main Thread Busy
The main thread is doing work instead of processing events.
Subcategories:
| Type | Example | Fix |
|---|---|---|
| Proactive work | Pre-computing data user hasn’t requested | Lazy initialization, compute on demand |
| Irrelevant work | Processing all notifications, not just relevant ones | Filter notifications, targeted observers |
| Suboptimal API | Using blocking API when async exists | Switch to async API |
2. Main Thread Blocked
The main thread is waiting for something else.
Subcategories:
| Type | Example | Fix |
|---|---|---|
| Synchronous IPC | Calling system service synchronously | Use async API variant |
| File I/O | Data(contentsOf:) on main thread | Move to background queue |
| Network | Synchronous URL request | Use URLSession async |
| Lock contention | Waiting for lock held by background thread | Reduce critical section, use actors |
| Semaphore/dispatch_sync | Blocking on background work | Restructure to async completion |
Decision Tree: Diagnosing Hangs
START: App hangs reported
│
└── Do you have hang diagnostics from Organizer or MetricKit?
    │
    ├── YES: Examine stack trace
    │   │
    │   ├── Stack shows your code running
    │   │   → BUSY: Main thread doing work
    │   │   → Profile with Time Profiler
    │   │
    │   └── Stack shows waiting (semaphore, lock, dispatch_sync)
    │       → BLOCKED: Main thread waiting
    │       → Profile with System Trace
    │
    └── NO: Can you reproduce?
        │
        ├── YES: Profile with Time Profiler first
        │   │
        │   ├── High CPU on main thread
        │   │   → BUSY: Optimize the work
        │   │
        │   └── Low CPU, thread blocked
        │       → Use System Trace to find what's blocking
        │
        └── NO: Enable MetricKit in app
            → Wait for field reports
            → Check Organizer > Hangs
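The tree above can also be written as a small pure function, which may be convenient for triage scripts or checklists. This is only a sketch: every type and case name here (`Situation`, `Recommendation`, and so on) is invented for illustration and does not come from Instruments or MetricKit.

```swift
/// What the field stack trace shows (hypothetical names for illustration).
enum StackEvidence {
    case appCodeRunning   // your code is executing on the main thread
    case waiting          // semaphore, lock, or dispatch_sync frames
}

/// What a local Time Profiler run showed, if one has been done.
enum ProfileEvidence {
    case notProfiledYet
    case highMainThreadCPU
    case lowCPUThreadBlocked
}

/// The three top-level branches of the decision tree.
enum Situation {
    case fieldDiagnostics(StackEvidence)
    case reproducible(ProfileEvidence)
    case notReproducible
}

enum Recommendation: Equatable {
    case profileWithTimeProfiler
    case profileWithSystemTrace
    case optimizeMainThreadWork
    case enableMetricKitAndWaitForReports
}

/// Maps each leaf of the decision tree to its recommended next step.
func recommend(for situation: Situation) -> Recommendation {
    switch situation {
    case .fieldDiagnostics(.appCodeRunning): return .profileWithTimeProfiler   // BUSY
    case .fieldDiagnostics(.waiting):        return .profileWithSystemTrace    // BLOCKED
    case .reproducible(.notProfiledYet):     return .profileWithTimeProfiler
    case .reproducible(.highMainThreadCPU):  return .optimizeMainThreadWork    // BUSY
    case .reproducible(.lowCPUThreadBlocked): return .profileWithSystemTrace   // BLOCKED
    case .notReproducible:                   return .enableMetricKitAndWaitForReports
    }
}
```

Because the switch is exhaustive, adding a new branch to the tree forces every caller path to be reconsidered at compile time.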
Tool Selection
| Scenario | Primary Tool | Why |
|---|---|---|
| Reproduces locally | Time Profiler | See exactly what main thread is doing |
| Blocked thread suspected | System Trace | Shows thread state, lock contention |
| Field reports only | Xcode Organizer | Aggregated hang diagnostics |
| Want in-app data | MetricKit | MXHangDiagnostic with call stacks |
| Need precise timing | System Trace | Nanosecond-level thread analysis |
Time Profiler Workflow for Hangs
- Launch Instruments: select the Time Profiler template
- Record during the hang: reproduce the freeze
- Stop recording: find the hang period in the timeline
- Select the hang region: drag to select the frozen timespan
- Examine the call tree: look for main thread work
What to look for:
- Functions with high “Self Time” on main thread
- Unexpectedly deep call stacks
- System calls that shouldn’t be on main thread
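Once Time Profiler points at a suspect function, a lightweight timing wrapper can confirm how long it holds the main thread in ordinary test runs. The sketch below complements the profiler rather than replacing it; `timed` is a hypothetical helper, and the 0.1-second default warning threshold is an arbitrary assumption.

```swift
import Foundation
import Dispatch

/// Runs `work`, returns its result together with the elapsed wall-clock seconds,
/// and prints a warning if it ran on the main thread for longer than `threshold`.
/// Hypothetical helper; the default threshold is an assumption, not an Apple figure.
func timed<T>(_ label: String,
              warnAfter threshold: TimeInterval = 0.1,
              _ work: () throws -> T) rethrows -> (result: T, seconds: TimeInterval) {
    let start = DispatchTime.now().uptimeNanoseconds
    let result = try work()
    let elapsedNanos = DispatchTime.now().uptimeNanoseconds - start
    let seconds = TimeInterval(elapsedNanos) / 1_000_000_000

    if Thread.isMainThread && seconds > threshold {
        print("warning: \(label) held the main thread for \(seconds) s")
    }
    return (result, seconds)
}
```

Wrapping a call as `timed("loadUserData") { loadUserData() }` leaves its behavior unchanged while recording how long it took.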
System Trace Workflow for Blocked Hangs
- Launch Instruments: select the System Trace template
- Record during the hang: capture thread states
- Find the main thread: filter to the main thread
- Look for red/orange: blocked states
- Examine the blocking reason: lock, semaphore, IPC
Thread states:
- Running (blue): Executing code
- Preempted (orange): Runnable but not scheduled
- Blocked (red): Waiting for resource
Common Hang Patterns and Fixes
Pattern 1: Synchronous File I/O
Before (hangs):
// Main thread blocks on file read
func loadUserData() {
    let data = try! Data(contentsOf: largeFileURL) // BLOCKS
    processData(data)
}
After (async):
func loadUserData() {
    Task.detached {
        // Runs off the main thread; a thrown error ends the task without blocking the UI
        let data = try Data(contentsOf: largeFileURL)
        await MainActor.run {
            self.processData(data)
        }
    }
}
Pattern 2: Unfiltered Notification Observer
Before (processes all):
NotificationCenter.default.addObserver(
    self,
    selector: #selector(handleChange),
    name: .NSManagedObjectContextObjectsDidChange,
    object: nil // Receives ALL contexts
)
After (filtered):
NotificationCenter.default.addObserver(
    self,
    selector: #selector(handleChange),
    name: .NSManagedObjectContextObjectsDidChange,
    object: relevantContext // Only this context
)
Pattern 3: Expensive Formatter Creation
Before (creates each time):
func formatDate(_ date: Date) -> String {
    let formatter = DateFormatter() // EXPENSIVE
    formatter.dateStyle = .medium
    return formatter.string(from: date)
}
After (cached):
private static let dateFormatter: DateFormatter = {
    let formatter = DateFormatter()
    formatter.dateStyle = .medium
    return formatter
}()
func formatDate(_ date: Date) -> String {
    Self.dateFormatter.string(from: date)
}
Pattern 4: dispatch_sync to Main Thread
Before (deadlock risk):
// From background thread
DispatchQueue.main.sync { // Blocks this thread until main is free; deadlocks if main is waiting on this thread
    updateUI()
}
After (async):
DispatchQueue.main.async {
    self.updateUI()
}
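When the same function may be called from either the main thread or a background thread, a small helper can avoid both the blocking `sync` call and an unnecessary hop. `runOnMain` is a hypothetical name used for illustration; it is not an API from Dispatch or from this skill.

```swift
import Foundation
import Dispatch

/// Runs `work` on the main thread without ever blocking the caller.
/// Already on main: run immediately. Otherwise: enqueue asynchronously.
/// Hypothetical helper for illustration.
func runOnMain(_ work: @escaping () -> Void) {
    if Thread.isMainThread {
        work()
    } else {
        DispatchQueue.main.async(execute: work)
    }
}
```

The trade-off is ordering: work submitted from a background thread runs later, so callers cannot assume the UI has been updated when `runOnMain` returns.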
Pattern 5: Semaphore for Async Result
Before (blocks main thread):
func fetchDataSync() -> Data {
    let semaphore = DispatchSemaphore(value: 0)
    var result: Data?
    URLSession.shared.dataTask(with: url) { data, _, _ in
        result = data
        semaphore.signal()
    }.resume()
    semaphore.wait() // BLOCKS MAIN THREAD
    return result!
}
After (async/await):
func fetchData() async throws -> Data {
    let (data, _) = try await URLSession.shared.data(from: url)
    return data
}
Pattern 6: Lock Contention
Before (shared lock):
class DataManager {
    private let lock = NSLock()
    private var cache: [String: Data] = [:]
    func getData(for key: String) -> Data? {
        lock.lock() // Main thread waits for background
        defer { lock.unlock() }
        return cache[key]
    }
}
After (actor):
actor DataManager {
    private var cache: [String: Data] = [:]
    func getData(for key: String) -> Data? {
        cache[key] // Actor serializes access safely
    }
}
Pattern 7: App Launch Hang (Watchdog)
Before (too much work):
func application(_ application: UIApplication,
                 didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
    loadAllUserData()    // Expensive
    setupAnalytics()     // Network calls
    precomputeLayouts()  // CPU intensive
    return true
}
After (deferred):
func application(_ application: UIApplication,
                 didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
    // Only essential setup
    setupMinimalUI()
    return true
}
func applicationDidBecomeActive(_ application: UIApplication) {
    // Defer non-essential work
    Task {
        await loadUserDataInBackground()
    }
}
Pattern 8: Image Processing on Main Thread
Before (blocks UI):
func processImage(_ image: UIImage) {
    let filtered = applyExpensiveFilter(image) // BLOCKS
    imageView.image = filtered
}
After (background processing):
func processImage(_ image: UIImage) {
    imageView.image = placeholder
    Task.detached(priority: .userInitiated) {
        let filtered = applyExpensiveFilter(image)
        await MainActor.run {
            self.imageView.image = filtered
        }
    }
}
Xcode Organizer Hang Diagnostics
Window > Organizer > Select App > Hangs
The Organizer shows aggregated hang data from users who opted into sharing diagnostics.
Reading the report:
- Hang Rate: Hangs per day per device
- Call Stack: Where the hang occurred
- Device/OS breakdown: Which configurations affected
Interpreting call stacks:
- Your code at top: Main thread busy with your work
- System API at top: You called blocking API on main thread
- pthread_mutex/semaphore: Lock contention or explicit waiting
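The three interpretation rules above can be sketched as a first-pass classifier for the top frame of a hang report. This is a rough heuristic under stated assumptions: the function name, the `HangKind` cases, and the substring markers are illustrative choices based on the rules in this section, not an Apple-defined taxonomy.

```swift
import Foundation

/// First-pass interpretation of the top frame of a hang call stack.
enum HangKind: Equatable {
    case waitingOnLockOrSemaphore   // lock contention or explicit waiting
    case busyInAppCode              // main thread busy with your work
    case blockingSystemCall         // you called a blocking API on the main thread
}

/// Classifies a top frame by symbol name and owning binary.
/// `appBinaryName` is the name of your own executable (an input you supply).
func classifyTopFrame(symbol: String, binaryName: String, appBinaryName: String) -> HangKind {
    // Markers taken from this section's rules; extend them for your own reports.
    let waitMarkers = ["pthread_mutex", "semaphore", "dispatch_sync"]
    let lowered = symbol.lowercased()

    if waitMarkers.contains(where: { lowered.contains($0) }) {
        return .waitingOnLockOrSemaphore
    }
    if binaryName == appBinaryName {
        return .busyInAppCode
    }
    return .blockingSystemCall
}
```

Wait markers are checked first because a waiting frame can live in a system library or in app code, and in both cases the useful diagnosis is "blocked", not "busy".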
MetricKit Hang Diagnostics
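Adopt MetricKit to receive hang diagnostics in your app. The subscriber only receives payloads after it has been registered with MXMetricManager.shared.add(_:), so register it early in launch: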
import MetricKit
class MetricsSubscriber: NSObject, MXMetricManagerSubscriber {
    func didReceive(_ payloads: [MXDiagnosticPayload]) {
        for payload in payloads {
            if let hangDiagnostics = payload.hangDiagnostics {
                for diagnostic in hangDiagnostics {
                    analyzeHang(diagnostic)
                }
            }
        }
    }
    private func analyzeHang(_ diagnostic: MXHangDiagnostic) {
        // Duration of the hang
        let duration = diagnostic.hangDuration
        // Call stack tree (needs symbolication)
        let callStack = diagnostic.callStackTree
        // Send to your analytics
        uploadHangDiagnostic(duration: duration, callStack: callStack)
    }
}
Key MXHangDiagnostic properties:
- hangDuration: How long the hang lasted
- callStackTree: MXCallStackTree with frames
- signatureIdentifier: For grouping similar hangs
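Because the call stack tree arrives unsymbolicated, a backend typically needs each frame's binary name and offset. The sketch below decodes the JSON form of a call stack tree with plain Foundation and lists those pairs. The key names follow the JSON that `MXCallStackTree.jsonRepresentation()` produces as far as I know, so verify them against a real payload; the struct names, the helper functions, and the sample data are all made up for illustration.

```swift
import Foundation

// Hypothetical Decodable mirrors of the call stack tree JSON.
struct Frame: Decodable {
    let binaryName: String?
    let binaryUUID: String?
    let address: UInt64?
    let offsetIntoBinaryTextSegment: UInt64?
    let sampleCount: Int?
    let subFrames: [Frame]?
}

struct CallStack: Decodable {
    let threadAttributed: Bool?
    let callStackRootFrames: [Frame]
}

struct CallStackTree: Decodable {
    let callStackPerThread: Bool?
    let callStacks: [CallStack]
}

func decodeCallStackTree(_ data: Data) throws -> CallStackTree {
    try JSONDecoder().decode(CallStackTree.self, from: data)
}

/// Depth-first list of "binary+offset" strings, the inputs a symbolicator needs.
func symbolicationTargets(in tree: CallStackTree) -> [String] {
    var targets: [String] = []
    func visit(_ frame: Frame) {
        targets.append("\(frame.binaryName ?? "?")+\(frame.offsetIntoBinaryTextSegment ?? 0)")
        frame.subFrames?.forEach(visit)
    }
    for stack in tree.callStacks {
        stack.callStackRootFrames.forEach(visit)
    }
    return targets
}

// Made-up sample data in the assumed shape; not taken from a real device.
let sampleJSON = """
{
  "callStackPerThread": true,
  "callStacks": [
    {
      "threadAttributed": true,
      "callStackRootFrames": [
        {
          "binaryName": "MyApp",
          "binaryUUID": "00000000-0000-0000-0000-000000000001",
          "offsetIntoBinaryTextSegment": 123,
          "sampleCount": 20,
          "address": 4312798220,
          "subFrames": [
            {
              "binaryName": "libsystem_kernel.dylib",
              "binaryUUID": "00000000-0000-0000-0000-000000000002",
              "offsetIntoBinaryTextSegment": 456,
              "sampleCount": 20,
              "address": 7170766264
            }
          ]
        }
      ]
    }
  ]
}
"""
```

On device, the input would be `diagnostic.callStackTree.jsonRepresentation()` rather than the sample string. All fields except the frame arrays are optional so that a missing key degrades the output instead of failing the whole decode.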
Watchdog Terminations
The watchdog kills apps that hang during key transitions:
| Transition | Time Limit | Consequence |
|---|---|---|
| App launch | ~20 seconds | App killed, crash logged |
| Background transition | ~5 seconds | App killed |
| Foreground transition | ~10 seconds | App killed |
Watchdog disabled in:
- Simulator
- Debugger attached
- Development builds (sometimes)
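Watchdog kills are logged as crashes with exception type EXC_CRASH (SIGKILL) and termination code 0x8badf00d (decimal 2343432205), reported under the FRONTBOARD namespace on recent OS versions and SPRINGBOARD on older ones. A different code, 0xDEAD10CC (decimal 3735883980, namespace RUNNINGBOARD), is not a watchdog kill: it indicates the app held a file lock or SQLite database lock while being suspended.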
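Crash reports print termination codes sometimes in hex and sometimes in decimal, which makes them easy to misread. The sketch below normalizes both forms; per Apple's crash-report documentation, 0x8badf00d is the watchdog code and 0xdead10cc is the code for holding a file or database lock during suspension. The type and function names are hypothetical.

```swift
import Foundation

/// Two well-known termination codes (names are illustrative).
enum TerminationCode: UInt32 {
    case watchdog = 0x8BADF00D                    // decimal 2343432205
    case fileLockHeldWhileSuspended = 0xDEAD10CC  // decimal 3735883980
}

/// Parses a code written as hex ("0x8badf00d") or decimal ("2343432205").
func parseTerminationCode(_ text: String) -> UInt32? {
    let trimmed = text.trimmingCharacters(in: .whitespaces).lowercased()
    if trimmed.hasPrefix("0x") {
        return UInt32(trimmed.dropFirst(2), radix: 16)
    }
    return UInt32(trimmed)
}

/// Short human-readable description for triage notes.
func describeTermination(_ text: String) -> String {
    guard let raw = parseTerminationCode(text) else { return "unparseable code" }
    switch TerminationCode(rawValue: raw) {
    case .watchdog?:
        return "watchdog: main thread unresponsive during a key transition"
    case .fileLockHeldWhileSuspended?:
        return "file or database lock held while suspended"
    case nil:
        return "other termination code"
    }
}
```

For example, `describeTermination("3735883980")` reports the lock-held-while-suspended case, which calls for a different fix than a launch hang.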
Pressure Scenarios
Scenario 1: Manager Says “Just Add a Loading Spinner”
Situation: App hangs during data load. Manager suggests adding spinner to “fix” it.
Why this fails: Adding a spinner doesn’t prevent the hang: the UI still freezes, the spinner won’t animate, and the app remains unresponsive.
Correct response: “A spinner won’t animate during a hang because the main thread is blocked. We need to move this work off the main thread so the spinner can actually spin and the app stays responsive.”
Scenario 2: “It Works Fine in Testing”
Situation: QA can’t reproduce the hang. Logs show it happens in production.
Analysis:
- Field devices have different data sizes
- Network conditions vary (slow connection = longer sync)
- Background apps consume memory/CPU
- Watchdog is disabled in debug builds
Action:
- Add MetricKit to capture field diagnostics
- Test with production-sized datasets
- Test without debugger attached
- Check Organizer for hang reports
Scenario 3: “We’ve Always Done It This Way”
Situation: Legacy code calls synchronous API on main thread. Refactoring is “too risky.”
Why it matters: Even if it worked before:
- Data may have grown larger
- OS updates may have changed timing
- New devices have different characteristics
- Users notice more as apps get faster
Approach:
- Add metrics to measure current hang rate
- Refactor incrementally with feature flags
- A/B test to show improvement
- Document risk of not fixing
Anti-Patterns to Avoid
| Anti-Pattern | Why It’s Wrong | Instead |
|---|---|---|
| DispatchQueue.main.sync from background | Can deadlock, always blocks | Use .async |
| Semaphore to convert async to sync | Blocks calling thread | Stay async with completion/await |
| File I/O on main thread | Unpredictable latency | Background queue |
| Unfiltered notification observer | Processes irrelevant events | Filter by object/name |
| Creating formatters in loops | Expensive initialization | Cache and reuse |
| Synchronous network request | Blocks on network latency | URLSession async |
Hang Prevention Checklist
Before shipping, verify:
- No Data(contentsOf:) or file reads on main thread
- No DispatchQueue.main.sync from background threads
- No semaphore.wait() on main thread
- Formatters (DateFormatter, NumberFormatter) are cached
- Notification observers filter appropriately
- Launch work is minimized (defer non-essential)
- Image processing happens off main thread
- Database queries don’t run on main thread
- MetricKit adopted for field diagnostics
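Several checklist items can be enforced in debug builds instead of by review alone. The sketch below guards a blocking file read with a main-thread assertion; `assertNotOnMainThread` and `readFileOffMain` are hypothetical helper names, and the same guard could be placed in front of database queries or semaphore waits.

```swift
import Foundation

/// Debug-only guard: stops a debug build when blocking work starts on the main thread.
/// `assert` is compiled out of optimized builds, so this adds no release overhead.
func assertNotOnMainThread(_ operation: StaticString = #function) {
    assert(!Thread.isMainThread, "\(operation) must not run on the main thread")
}

/// Blocking file read that documents and checks its threading requirement.
func readFileOffMain(at url: URL) throws -> Data {
    assertNotOnMainThread()
    return try Data(contentsOf: url)
}
```

A violation then fails at the call site during development rather than appearing later as a field hang report.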
Resources
WWDC: 2021-10258, 2022-10082
Docs: /xcode/analyzing-responsiveness-issues-in-your-shipping-app, /metrickit/mxhangdiagnostic
Skills: axiom-metrickit-ref, axiom-performance-profiling, axiom-swift-concurrency