pytorch

📁 g1joshi/agent-skills 📅 3 days ago
Total installs: 2
Weekly installs: 2
Site-wide rank: #72738
Install command
npx skills add https://github.com/g1joshi/agent-skills --skill pytorch

Agent install distribution

mcpjam 2
claude-code 2
replit 2
junie 2
zencoder 2

Skill documentation

PyTorch

Deep learning framework for research and production.

When to Use

  • Deep learning research
  • Custom neural network architectures
  • GPU-accelerated training
  • Model prototyping

Quick Start

import torch
import torch.nn as nn

# Simple model
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10)
)

x = torch.randn(32, 784)  # Batch of 32
output = model(x)  # logits, shape (32, 10)

Core Concepts

Tensors & Autograd

import torch

# Create tensors
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = torch.randn(3, 4, device='cuda')  # GPU tensor (requires a CUDA device)

# Operations
z = x @ torch.randn(3, 4)  # vector-matrix product, shape (4,)
z = torch.softmax(z, dim=-1)

# Autograd
loss = z.sum()
loss.backward()
print(x.grad)  # Gradients
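
The snippet above hard-codes device='cuda', which fails on CPU-only machines. A portable pattern (a minimal sketch) selects the device at runtime:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
y = torch.randn(3, 4, device=device)  # lands on the GPU only when one is available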

Custom Modules

class TransformerBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        # batch_first=True: inputs are shaped (batch, seq_len, d_model)
        self.attention = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_model * 4),
            nn.GELU(),
            nn.Linear(d_model * 4, d_model)
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attention(x, x, x)
        x = self.norm1(x + attn_out)
        x = self.norm2(x + self.ff(x))
        return x
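
As a quick sanity check (the sizes here are illustrative, assuming batch_first=True as above):

block = TransformerBlock(d_model=512, n_heads=8)
x = torch.randn(16, 128, 512)  # (batch, seq_len, d_model)
out = block(x)                 # same shape: (16, 128, 512)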

Common Patterns

Training Loop

model = MyModel().to('cuda')
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

for epoch in range(epochs):  # epochs and dataloader are assumed to be defined
    model.train()
    for batch in dataloader:
        inputs, targets = batch
        inputs, targets = inputs.to('cuda'), targets.to('cuda')

        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()

# Save model
torch.save(model.state_dict(), 'model.pt')
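
To restore the weights later, a minimal sketch: load the state dict into a freshly constructed model, then switch to eval mode before inference.

model = MyModel()
model.load_state_dict(torch.load('model.pt', map_location='cpu'))
model.eval()  # disables dropout and freezes batch-norm statistics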

Best Practices

Do:

  • Use torch.no_grad() for inference
  • Move data to GPU with pinned memory and non_blocking transfers
  • Use mixed precision training (see the sketch after this list)
  • Profile with torch.profiler
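
A minimal sketch combining the no_grad and mixed-precision points, using the torch.cuda.amp API (newer releases also expose it as torch.amp); model, dataloader, optimizer, and criterion are assumed to be defined as in the training loop above:

scaler = torch.cuda.amp.GradScaler()

model.train()
for inputs, targets in dataloader:
    inputs, targets = inputs.to('cuda'), targets.to('cuda')
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():  # run the forward pass in mixed precision
        loss = criterion(model(inputs), targets)
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)         # unscales gradients, then steps the optimizer
    scaler.update()

model.eval()
with torch.no_grad():  # inference: no gradient bookkeeping
    preds = model(inputs).argmax(dim=-1)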

Don’t:

  • Forget to call model.eval() before inference
  • Skip zeroing gradients between optimizer steps
  • Allocate new tensors inside tight loops
  • Ignore CUDA memory management

Troubleshooting

Issue          Cause                Solution
CUDA OOM       Memory exhausted     Reduce batch size
NaN loss       Gradient explosion   Lower learning rate
Slow training  CPU bottleneck       Use DataLoader workers
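
Two of these fixes as code (the worker count and clip value are illustrative, and dataset is assumed to be an existing torch Dataset):

from torch.utils.data import DataLoader

# Parallel data loading to relieve a CPU bottleneck
loader = DataLoader(dataset, batch_size=64, shuffle=True,
                    num_workers=4, pin_memory=True)

# Gradient clipping to tame exploding gradients; call after loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)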
