exec-remote

📁 primatrix/skills 📅 1 day ago
Install command
npx skills add https://github.com/primatrix/skills --skill exec-remote

Remote Execution Skill

This skill handles running code on remote GPU or TPU clusters via SkyPilot.

1. Determine Target Device

Identify the target device from the user’s request:

Target   Cluster name file    Launch script    UV extra   Env prefix
GPU      .cluster_name_gpu    launch_gpu.sh    gpu        export CUDA_VISIBLE_DEVICES=0;
TPU      .cluster_name_tpu    launch_tpu.sh    tpu        (none)

Execution Instructions: Before running the launch script, you must find its absolute path. It is located in the scripts/ directory alongside this skill definition. Use your file search tools (e.g., glob or find) to locate launch_gpu.sh or launch_tpu.sh before executing it.
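The lookup described above could be sketched roughly as follows. The function name find_launch_script and the idea of passing a search root are assumptions for illustration; the skill itself only says to use file search tools such as glob or find.

```shell
# Hypothetical helper: print the absolute path of a launch script.
# $1 = directory to search (assumed to contain the skill's scripts/ directory somewhere below it)
# $2 = script name, e.g. launch_gpu.sh or launch_tpu.sh
find_launch_script() (
  # Resolve the search root to an absolute path so the result is absolute too.
  root=$(cd "$1" && pwd) || exit 1
  # Only accept files that sit directly inside a scripts/ directory.
  match=$(find "$root" -type f -path "*/scripts/$2" | head -n 1)
  [ -n "$match" ] || { echo "not found: scripts/$2 under $root" >&2; exit 1; }
  printf '%s\n' "$match"
)
```

For example, find_launch_script "$HOME" launch_gpu.sh would print the first matching path, which can then be passed to bash. If several copies of the skill are installed, only the first match is returned, so check that it is the intended one.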

If the user does not specify a device, ask them which one to use.
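As a minimal sketch, the device table can be expressed as a small lookup that fails loudly on anything other than GPU or TPU, which is the case where the user should be asked. The function name resolve_target and its output format are hypothetical and not part of the skill.

```shell
# Hypothetical helper: map a device name to the settings from the table above.
# Prints: <cluster_name_file> <launch_script> <uv_extra>
resolve_target() {
  case "$1" in
    gpu|GPU) echo ".cluster_name_gpu launch_gpu.sh gpu" ;;
    tpu|TPU) echo ".cluster_name_tpu launch_tpu.sh tpu" ;;
    # Unknown or missing device: report it and let the caller ask the user.
    *) echo "unknown device: '$1' (expected gpu or tpu)" >&2; return 1 ;;
  esac
}
```

A caller might run resolve_target "$device" and, on a non-zero exit status, stop and ask which device to use instead of guessing.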

2. Prerequisites

  • The cluster must already be provisioned. Check that the corresponding cluster name file (.cluster_name_gpu or .cluster_name_tpu) exists and is non-empty in the project root.
  • If the file does not exist or is empty, ask the user to provision a cluster first using the appropriate launch script.
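The prerequisite check above might look like the sketch below. The function name read_cluster_name is an assumption. It also treats a file that contains only whitespace as empty, which goes slightly beyond the literal "non-empty" wording.

```shell
# Hypothetical helper: print the cluster name, or fail if none is recorded.
# $1 = path to .cluster_name_gpu or .cluster_name_tpu in the project root
read_cluster_name() (
  # -s is true only if the file exists and has a size greater than zero.
  [ -s "$1" ] || { echo "no cluster recorded in $1; provision one first" >&2; exit 1; }
  # Drop the trailing newline and any stray whitespace around the name.
  name=$(tr -d '[:space:]' < "$1")
  [ -n "$name" ] || { echo "$1 contains only whitespace; provision a cluster first" >&2; exit 1; }
  printf '%s\n' "$name"
)
```

When the function fails, the next step is to ask the user to provision a cluster with the appropriate launch script rather than attempting to run anything.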

3. Cluster Management

Provisioning

# Note: First locate the scripts as instructed above, then run them.
# GPU — common accelerator types: H100:1, A100:1, L4:1
bash <absolute_path_to_launch_gpu.sh> <accelerator_type> <experiment_name>

# TPU — common accelerator types: tpu-v4-8, tpu-v4-16, tpu-v6e-1, tpu-v6e-4
bash <absolute_path_to_launch_tpu.sh> <accelerator_type> <experiment_name>

The launch script automatically updates the corresponding .cluster_name_* file.

Teardown

# GPU
sky down $(cat .cluster_name_gpu) -y

# TPU
sky down $(cat .cluster_name_tpu) -y

4. Execution Command

GPU

sky exec $(cat .cluster_name_gpu) --workdir . "export CUDA_VISIBLE_DEVICES=0; uv run --extra gpu python <PATH_TO_SCRIPT> [ARGS]"
  • export CUDA_VISIBLE_DEVICES=0; restricts the job to GPU 0, so single-GPU runs always use the same device. Adjust for multi-GPU jobs (e.g. CUDA_VISIBLE_DEVICES=0,1).
  • --extra gpu activates GPU optional dependencies (e.g. jax[cuda]).

TPU

sky exec $(cat .cluster_name_tpu) --workdir . "uv run --extra tpu python <PATH_TO_SCRIPT> [ARGS]"
  • --extra tpu activates TPU optional dependencies (e.g. jax[tpu]).

Common flags

  • --workdir . syncs the current local directory to the remote instance before running.
  • For pytest, use python -m pytest <test_path> instead of calling pytest directly.
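The GPU and TPU commands differ only in the env prefix and the UV extra, so the quoted string passed to sky exec can be assembled in one place. This is a sketch under that assumption; build_remote_cmd is a hypothetical name, and arguments containing spaces are not re-quoted here.

```shell
# Hypothetical helper: print the command string that goes inside the quotes of `sky exec`.
# $1 = device (gpu or tpu); remaining arguments = script path (or -m pytest <test_path>) and its args
build_remote_cmd() (
  device=$1
  shift
  case "$device" in
    gpu) prefix="export CUDA_VISIBLE_DEVICES=0; " ;;  # GPU runs pin device 0
    tpu) prefix="" ;;                                 # TPU runs need no env prefix
    *) echo "unknown device: $device" >&2; exit 1 ;;
  esac
  # The UV extra has the same name as the device (gpu or tpu).
  printf '%s\n' "${prefix}uv run --extra $device python $*"
)
```

It could then be used as sky exec "$(cat .cluster_name_tpu)" --workdir . "$(build_remote_cmd tpu -m pytest src/lynx/test/)".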

5. Usage Examples

Run a benchmark on GPU:

sky exec $(cat .cluster_name_gpu) --workdir . "export CUDA_VISIBLE_DEVICES=0; uv run --extra gpu python src/lynx/perf/benchmark_train.py"

Run tests on TPU:

sky exec $(cat .cluster_name_tpu) --workdir . "uv run --extra tpu python -m pytest src/lynx/test/"

6. Operational Notes

  • Logs: SkyPilot streams stdout and stderr directly to the terminal.
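  • Interruption: Ctrl+C typically stops only the local log stream, so the remote process may keep running. Use sky queue and sky cancel to find and stop it, and check the SkyPilot docs if further cleanup is needed.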
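For cleanup after an interrupted run, one possible sequence is to list the jobs on the cluster and cancel the one that is still running. The sketch below only prints the two SkyPilot commands so they can be reviewed before running; the helper name print_cleanup_cmds is hypothetical, and the job ID must be read from the sky queue output.

```shell
# Hypothetical helper: print (not run) the SkyPilot commands for cleaning up a stray job.
# $1 = cluster name file (.cluster_name_gpu or .cluster_name_tpu)
# $2 = job ID, as shown in the output of `sky queue`
print_cleanup_cmds() (
  cluster=$(cat "$1") || exit 1
  echo "sky queue $cluster"       # lists jobs on the cluster with their IDs and status
  echo "sky cancel $cluster $2"   # cancels the given job on that cluster
)
```

Printing instead of executing keeps the cancel step an explicit decision, which is safer when an agent is acting on the user's behalf.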