metal-gpu
Install command
npx skills add https://github.com/ios-agent/iosagent.dev --skill metal-gpu
Skill documentation
Metal GPU Code Skill
Write production-quality Metal code with correct patterns, optimal performance, and clear explanations.
When to Read References
For detailed API topology, Metal 4 specifics, and Apple Silicon optimization patterns, read:
/mnt/skills/user/metal-gpu/references/metal-api-guide.md
Core Principles
- Always start with the device: MTLCreateSystemDefaultDevice() is where every Metal workflow begins
- Command pattern: Device → Command Queue → Command Buffer → Command Encoder → Commit
- Shaders are MSL (Metal Shading Language): C++14-based, with Metal-specific types and attributes
- Resource management matters: Use appropriate storage modes, avoid unnecessary copies
- Triple buffering for render loops to keep CPU and GPU in parallel
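The triple-buffering principle can be sketched without any Metal types. FrameRing below is a hypothetical helper (it is not part of Metal or of this skill): it hands out one of N slots per frame and blocks the CPU only when the GPU still owns all of them. In a renderer the slots would typically be three MTLBuffer objects, and release() would be called from the command buffer's completed handler.

```swift
import Dispatch

/// Hypothetical helper: rotates a fixed set of per-frame slots (for example
/// three MTLBuffers) and blocks when every slot is still in flight.
final class FrameRing<Slot> {
    private let slots: [Slot]
    private let semaphore: DispatchSemaphore
    private var index = 0

    init(slots: [Slot]) {
        precondition(!slots.isEmpty, "FrameRing needs at least one slot")
        self.slots = slots
        // One permit per slot: the CPU may run at most slots.count frames ahead.
        self.semaphore = DispatchSemaphore(value: slots.count)
    }

    /// Blocks until a slot is free, then returns the next slot in rotation.
    func acquire() -> Slot {
        semaphore.wait()
        let slot = slots[index]
        index = (index + 1) % slots.count
        return slot
    }

    /// Call once the GPU has finished with a slot.
    func release() {
        semaphore.signal()
    }
}
```

A draw method would call acquire() before writing per-frame data, then register commandBuffer.addCompletedHandler { _ in ring.release() } before commit(). Every acquire() needs a matching release(), otherwise the render loop eventually stalls.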
Quick Reference: Metal Command Pipeline
MTLDevice
└─ makeCommandQueue() → MTLCommandQueue
   └─ makeCommandBuffer() → MTLCommandBuffer
      ├─ makeRenderCommandEncoder(descriptor:) → MTLRenderCommandEncoder
      ├─ makeComputeCommandEncoder() → MTLComputeCommandEncoder
      └─ makeBlitCommandEncoder() → MTLBlitCommandEncoder
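Of the three encoders in this pipeline, the blit encoder is the only one without a pattern elsewhere in this document. A minimal sketch, assuming a buffer-to-buffer copy is what is needed: encodeCopy is a hypothetical helper name, and the caller is assumed to own the queue and both buffers.

```swift
import Metal

/// Hypothetical helper: encodes a GPU-side copy of `byteCount` bytes from the
/// start of `source` to the start of `destination`.
/// Returns the committed command buffer, or nil if the sizes do not fit or a
/// Metal object could not be created.
func encodeCopy(from source: MTLBuffer,
                to destination: MTLBuffer,
                byteCount: Int,
                on queue: MTLCommandQueue) -> MTLCommandBuffer? {
    guard byteCount <= source.length,
          byteCount <= destination.length,
          let commandBuffer = queue.makeCommandBuffer(),
          let blit = commandBuffer.makeBlitCommandEncoder() else { return nil }

    blit.copy(from: source, sourceOffset: 0,
              to: destination, destinationOffset: 0,
              size: byteCount)
    blit.endEncoding()        // encoding must end before the buffer is committed
    commandBuffer.commit()
    return commandBuffer
}
```

The returned command buffer lets the caller attach a completion handler or, in a test, wait for the copy to finish.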
Writing Shaders (MSL)
Use Metal Shading Language. Always include:
- #include <metal_stdlib> and using namespace metal;
- Correct attribute qualifiers: [[vertex_id]], [[position]], [[stage_in]], [[buffer(n)]], [[texture(n)]]
- Proper address space qualifiers: device, constant, threadgroup, thread
Vertex Shader Pattern
#include <metal_stdlib>
using namespace metal;

struct VertexIn {
    float3 position [[attribute(0)]];
    float3 normal   [[attribute(1)]];
    float2 texCoord [[attribute(2)]];
};

struct VertexOut {
    float4 position [[position]];
    float3 normal;
    float2 texCoord;
};

vertex VertexOut vertex_main(VertexIn in [[stage_in]],
                             constant float4x4 &mvp [[buffer(1)]]) {
    VertexOut out;
    out.position = mvp * float4(in.position, 1.0);
    out.normal = in.normal;
    out.texCoord = in.texCoord;
    return out;
}
Fragment Shader Pattern
fragment float4 fragment_main(VertexOut in [[stage_in]],
                              texture2d<float> albedo [[texture(0)]],
                              sampler texSampler [[sampler(0)]]) {
    float4 color = albedo.sample(texSampler, in.texCoord);
    return color;
}
Compute Kernel Pattern
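// Assumes the grid has exactly one thread per element (as with dispatchThreads),
// so no bounds check on id is performed here.
kernel void compute_main(device float *input  [[buffer(0)]],
                         device float *output [[buffer(1)]],
                         uint id [[thread_position_in_grid]]) {
    output[id] = input[id] * 2.0;
}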
Swift-Side Setup Patterns
Render Pipeline Setup
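import Metal

// Force-unwraps keep the sample short; production code should handle nil and thrown errors.
let device = MTLCreateSystemDefaultDevice()!
let commandQueue = device.makeCommandQueue()!
// Load shaders
let library = device.makeDefaultLibrary()!
let vertexFunction = library.makeFunction(name: "vertex_main")
let fragmentFunction = library.makeFunction(name: "fragment_main")
// Pipeline descriptor
let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.vertexFunction = vertexFunction
pipelineDescriptor.fragmentFunction = fragmentFunction
pipelineDescriptor.colorAttachments[0].pixelFormat = .bgra8Unorm  // must match the render target
// Vertex descriptor: one entry per [[attribute(n)]] declared in VertexIn.
// Assumes tightly packed floats: position (12 bytes), normal (12 bytes), texCoord (8 bytes).
let vertexDescriptor = MTLVertexDescriptor()
vertexDescriptor.attributes[0].format = .float3  // position
vertexDescriptor.attributes[0].offset = 0
vertexDescriptor.attributes[0].bufferIndex = 0
vertexDescriptor.attributes[1].format = .float3  // normal
vertexDescriptor.attributes[1].offset = 12
vertexDescriptor.attributes[1].bufferIndex = 0
vertexDescriptor.attributes[2].format = .float2  // texCoord
vertexDescriptor.attributes[2].offset = 24
vertexDescriptor.attributes[2].bufferIndex = 0
vertexDescriptor.layouts[0].stride = 32
pipelineDescriptor.vertexDescriptor = vertexDescriptor
let pipelineState = try! device.makeRenderPipelineState(descriptor: pipelineDescriptor)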
Compute Pipeline Setup
// Assumes inputBuffer and outputBuffer (MTLBuffer) and elementCount (Int) already exist.
let computeFunction = library.makeFunction(name: "compute_main")!
let computePipeline = try! device.makeComputePipelineState(function: computeFunction)
let commandBuffer = commandQueue.makeCommandBuffer()!
let encoder = commandBuffer.makeComputeCommandEncoder()!
encoder.setComputePipelineState(computePipeline)
encoder.setBuffer(inputBuffer, offset: 0, index: 0)   // [[buffer(0)]]
encoder.setBuffer(outputBuffer, offset: 0, index: 1)  // [[buffer(1)]]
// One thread per element; the grid need not be a multiple of the threadgroup size.
let gridSize = MTLSize(width: elementCount, height: 1, depth: 1)
let threadGroupSize = MTLSize(
    width: min(computePipeline.maxTotalThreadsPerThreadgroup, elementCount),
    height: 1, depth: 1
)
encoder.dispatchThreads(gridSize, threadsPerThreadgroup: threadGroupSize)
encoder.endEncoding()
commandBuffer.commit()
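The dispatchThreads call relies on non-uniform threadgroup support, which older GPU families lack. On such devices the usual fallback is dispatchThreadgroups with a threadgroup count rounded up to cover the data. The sketch below shows only that rounding; threadgroupsNeeded is a hypothetical helper name, and the commented usage assumes the encoder, computePipeline, and elementCount from the setup in this section.

```swift
/// Number of threadgroups needed to cover `elementCount` items (ceiling division).
func threadgroupsNeeded(elementCount: Int, threadsPerThreadgroup: Int) -> Int {
    precondition(elementCount >= 0 && threadsPerThreadgroup > 0)
    return (elementCount + threadsPerThreadgroup - 1) / threadsPerThreadgroup
}

// Hypothetical usage with the compute encoder from this section:
// let width = computePipeline.threadExecutionWidth
// let groups = threadgroupsNeeded(elementCount: elementCount, threadsPerThreadgroup: width)
// encoder.dispatchThreadgroups(
//     MTLSize(width: groups, height: 1, depth: 1),
//     threadsPerThreadgroup: MTLSize(width: width, height: 1, depth: 1)
// )
```

Because the rounded-up grid can contain more threads than elements, a kernel dispatched this way needs a bounds check such as `if (id >= count) return;`, where count is an extra kernel argument that compute_main does not currently take.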
MetalKit View Rendering
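import MetalKit

class Renderer: NSObject, MTKViewDelegate {
    let device: MTLDevice
    let commandQueue: MTLCommandQueue
    let pipelineState: MTLRenderPipelineState

    init(device: MTLDevice, commandQueue: MTLCommandQueue, pipelineState: MTLRenderPipelineState) {
        self.device = device
        self.commandQueue = commandQueue
        self.pipelineState = pipelineState
        super.init()
    }

    // Required by MTKViewDelegate; update size-dependent state (e.g. projection) here.
    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {}

    func draw(in view: MTKView) {
        // Skip the frame if any per-frame Metal object is unavailable.
        guard let drawable = view.currentDrawable,
              let descriptor = view.currentRenderPassDescriptor,
              let commandBuffer = commandQueue.makeCommandBuffer(),
              let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: descriptor) else { return }
        encoder.setRenderPipelineState(pipelineState)
        // Set buffers, draw primitives...
        encoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3)
        encoder.endEncoding()
        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}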
Performance Best Practices
- Storage modes: Use .shared on Apple Silicon (unified memory), .private for GPU-only data, .managed on Intel Macs
- Triple buffering: Rotate 3 buffers with a semaphore to avoid CPU/GPU stalls
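- Avoid per-frame allocations: Reuse buffers, textures, and pipeline states across frames; command buffers and encoders are single-use, so create them each frame but keep their number small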
- Use dispatchThreads over dispatchThreadgroups when possible (Apple Silicon)
- Prefer tile-based deferred rendering patterns on Apple GPUs: use imageblocks and tile shaders
- Compile pipelines ahead of time: Pipeline creation is expensive, do it at load time
- Use Metal GPU frame capture in Xcode to profile and debug
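The storage-mode guidance in the list above can be written as a small decision function. This is a sketch under assumptions: storageOptions is a hypothetical helper name, and the caller is assumed to know whether the CPU ever touches the resource. The .storageModeManaged case is compiled only on macOS, since managed storage is not offered on iOS.

```swift
import Metal

/// Hypothetical helper mapping the storage-mode guidance to MTLResourceOptions.
/// - hasUnifiedMemory: pass device.hasUnifiedMemory (true on Apple Silicon).
/// - gpuOnly: true when the CPU never reads or writes the resource.
func storageOptions(hasUnifiedMemory: Bool, gpuOnly: Bool) -> MTLResourceOptions {
    if gpuOnly {
        return .storageModePrivate
    }
    #if os(macOS)
    // GPUs without unified memory keep separate CPU and GPU copies in managed mode.
    return hasUnifiedMemory ? .storageModeShared : .storageModeManaged
    #else
    return .storageModeShared
    #endif
}
```

A call site might look like device.makeBuffer(length: byteCount, options: storageOptions(hasUnifiedMemory: device.hasUnifiedMemory, gpuOnly: false)). Managed buffers additionally need didModifyRange(_:) after CPU writes, which this sketch does not cover.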
Common Mistakes to Avoid
- Forgetting encoder.endEncoding() before committing
- Mismatched buffer indices between Swift and MSL
- Using wrong pixel format for render targets
- Not handling nil from optional Metal API calls
- Blocking the main thread waiting for GPU completion; use addCompletedHandler instead
- Forgetting to set the vertex descriptor when using [[stage_in]]
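The advice to use addCompletedHandler rather than blocking can be sketched as follows. The function name commit(_:notifyOn:completion:) and its Result-based callback are illustrative assumptions; only addCompletedHandler, error, and commit() are Metal API.

```swift
import Dispatch
import Metal

/// Hypothetical helper: commits a command buffer and reports the outcome
/// asynchronously instead of blocking with waitUntilCompleted().
func commit(_ commandBuffer: MTLCommandBuffer,
            notifyOn queue: DispatchQueue = .main,
            completion: @escaping @Sendable (Result<Void, Error>) -> Void) {
    // The handler must be registered before commit().
    commandBuffer.addCompletedHandler { finished in
        let result: Result<Void, Error>
        if let error = finished.error {
            result = .failure(error)
        } else {
            result = .success(())
        }
        // Metal invokes the handler on its own thread; hop to the caller's queue.
        queue.async { completion(result) }
    }
    commandBuffer.commit()
}
```

Delivering the result on the main queue by default suits UI updates; a background queue can be passed when the caller does further processing of GPU output.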
Metal 4 Notes
Metal 4 introduces a modernized core API. Key changes:
- New compilation API for finer shader compilation control
- Updated command encoding patterns
- See references/metal-api-guide.md for the full Metal 4 API topology
Frameworks Ecosystem
| Framework | Purpose |
|---|---|
| Metal | Direct GPU access, shaders, pipelines |
| MetalKit | View management, texture loading, model I/O |
| MetalFX | Upscaling (temporal/spatial) for performance |
| Metal Performance Shaders | Optimized compute & image processing kernels |
| Compositor Services | Stereoscopic rendering for visionOS |
| RealityKit | High-level 3D rendering (uses Metal underneath) |