pytorch-core

📁 cuba6112/skillfactory 📅 4 days ago
Total installs: 1
Weekly installs: 1
Site rank: #52762

Install command
npx skills add https://github.com/cuba6112/skillfactory --skill pytorch-core

Agent install distribution

mcpjam 1
claude-code 1
junie 1
windsurf 1
zencoder 1
crush 1

Skill documentation

Overview

Core PyTorch provides the fundamental building blocks for deep learning, focusing on tensor computation with strong GPU acceleration and a deep-learning-oriented autograd system. It emphasizes a “define-by-run” approach where models are standard Python objects.

When to Use

Use PyTorch Core when you need granular control over model architecture, custom training loops, or specific hardware optimizations like pinned memory for data transfers.

Decision Tree

  1. Do you know the input dimensions of your data?
    • YES: Use standard layers (e.g., nn.Linear).
    • NO: Use Lazy modules (e.g., nn.LazyLinear) to defer initialization (see Workflow 3 and its sketch below).
  2. Is your bottleneck data transfer to the GPU?
    • YES: Enable pin_memory=True in your DataLoader (sketched after this list).
    • NO: Standard data loading suffices.
  3. Are you fine-tuning a model?
    • YES: Set requires_grad=False for the frozen parameters (sketched after this list).
    • NO: Keep requires_grad=True for full training.
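
A minimal sketch of branches 2 and 3, combining pin_memory=True with parameter freezing. The toy dataset, model, and the choice to freeze only the first layer are illustrative assumptions, not part of this skill's scripts.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset; any torch.utils.data.Dataset works the same way.
ds = TensorDataset(torch.randn(256, 10), torch.randint(0, 2, (256,)))

# Branch 2: pinned (page-locked) host memory speeds up CPU-to-GPU copies
# and allows asynchronous transfers via non_blocking=True.
loader = DataLoader(ds, batch_size=32, shuffle=True, pin_memory=True)

# Branch 3: freeze part of a model for fine-tuning by disabling gradients.
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 2))
for p in model[0].parameters():  # freeze only the first layer
    p.requires_grad = False

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
x, y = next(iter(loader))
x = x.to(device, non_blocking=True)  # non_blocking only helps with pinned tensors
y = y.to(device, non_blocking=True)
```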

Workflows

  1. Standard Training Iteration (sketched below)

    1. Load a batch of data from the DataLoader.
    2. Zero the gradients using optimizer.zero_grad().
    3. Perform a forward pass through the nn.Module.
    4. Compute the loss using a criterion (e.g., nn.CrossEntropyLoss).
    5. Execute a backward pass with loss.backward() to compute gradients.
    6. Update model parameters using optimizer.step().
  2. Model Persistence and Checkpointing (sketched below)

    1. Capture the state of the model and optimizer using .state_dict().
    2. Save the dictionaries to a file using torch.save().
    3. Restore the model by instantiating the class and calling .load_state_dict().
    4. Call model.eval() before inference to put Dropout and BatchNorm layers in inference mode.
  3. Deferred Architecture Initialization (sketched below)

    1. Define the model using Lazy modules (e.g., nn.LazyLinear).
    2. Initialize the model on the desired device.
    3. Run a dummy input or the first real batch through the model.
    4. PyTorch automatically infers and sets the weight shapes based on the input.
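
A runnable sketch of Workflow 1. The toy dataset, model architecture, and SGD hyperparameters are illustrative assumptions; only the six-step loop structure comes from the workflow above.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Toy data and model stand in for a real dataset and architecture.
loader = DataLoader(
    TensorDataset(torch.randn(512, 20), torch.randint(0, 3, (512,))),
    batch_size=64,
    shuffle=True,
)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

model.train()
for x, y in loader:              # 1. load a batch
    x, y = x.to(device), y.to(device)
    optimizer.zero_grad()        # 2. zero stale gradients
    logits = model(x)            # 3. forward pass
    loss = criterion(logits, y)  # 4. compute the loss
    loss.backward()              # 5. backward pass
    optimizer.step()             # 6. update parameters
```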
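A sketch of Workflow 2. The checkpoint layout (one dict holding both state dicts) and the make_model helper are conventions assumed here; torch.save, torch.load, .state_dict(), and .load_state_dict() are the actual API.

```python
import torch
from torch import nn

def make_model() -> nn.Module:
    # The architecture must match exactly between save and load.
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))

model = make_model()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# Save: capture model and optimizer state in a single checkpoint file.
torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
    "checkpoint.pt",
)

# Restore: re-instantiate the class, then load the saved state dicts.
restored = make_model()
restored_opt = torch.optim.SGD(restored.parameters(), lr=1e-2)
state = torch.load("checkpoint.pt", map_location="cpu")
restored.load_state_dict(state["model"])
restored_opt.load_state_dict(state["optimizer"])

restored.eval()  # put Dropout/BatchNorm into inference mode
with torch.no_grad():
    preds = restored(torch.randn(1, 20))
```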
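A sketch of Workflow 3, assuming an image-shaped input; nn.LazyConv2d and nn.LazyLinear infer their input sizes from the first batch they see.

```python
import torch
from torch import nn

# Lazy layers leave in_channels/in_features unset until the first forward.
model = nn.Sequential(
    nn.LazyConv2d(out_channels=16, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.LazyLinear(out_features=10),  # input size inferred at first call
)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# A dummy batch (or the first real batch) materializes the weights.
dummy = torch.randn(2, 3, 28, 28, device=device)
out = model(dummy)
print(out.shape)  # torch.Size([2, 10])
# After this call the lazy modules behave like ordinary Conv2d/Linear layers.
```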

Non-Obvious Insights

  • Lazy Initialization: Using LazyLinear or LazyConv2d simplifies architecture definitions where input dimensions are unknown, preventing manual shape calculation errors.
  • Data Transfer Optimization: Setting pin_memory=True on a DataLoader stages batches in page-locked (pinned) host memory, which speeds up CPU-to-GPU transfer and enables asynchronous copies via non_blocking=True.
  • Dynamic Gradient Control: The requires_grad attribute can be toggled on-the-fly to freeze parameters during fine-tuning or transfer learning without re-instantiating the model.

Evidence

Scripts

  • scripts/pytorch-core_tool.py: Provides a standard training loop skeleton and lazy initialization examples.
  • scripts/pytorch-core_tool.js: Node.js wrapper for invoking PyTorch training scripts.

Dependencies

  • torch
  • torchvision (optional for datasets)
  • numpy
