numerai-model-upload
npx skills add https://github.com/numerai/example-scripts --skill numerai-model-upload
Numerai Model Upload
Overview
Create a portable predict(live_features, live_benchmark_models) pickle that runs inside Numerai’s numerai_predict container without repo dependencies.
CRITICAL: Python Version Compatibility
Before creating any pkl file, you must ensure your Python environment matches Numerai’s compute environment. Mismatched versions cause segfaults and validation failures due to binary incompatibility (especially with numpy).
Step 1: Query the Default Docker Image (MCP Required)
If the numerai MCP server is available, always query the default Python version first:
query { computePickleDockerImages { id name image tag default } }
Look for the entry with default: true. The image name indicates the Python version:
- `numerai_predict_py_3_12:a78dedd` → Python 3.12 (current default as of 2026)
- `numerai_predict_py_3_11:a78dedd` → Python 3.11
- `numerai_predict_py_3_10:a78dedd` → Python 3.10
If the numerai MCP is not installed, it can be installed via our install script: `curl -sL https://numer.ai/install-mcp.sh | bash`. The script guides the user through installing the MCP for Codex CLI and configuring an API key with the scopes the MCP requires.
You can find more documentation about Numerai MCP here: https://docs.numer.ai/numerai-tournament/mcp
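Once you have the query result, you need the Python major.minor version out of the image name. A small helper can extract it; this is a sketch (the function name `python_version_from_image` is mine, not part of the MCP), assuming image names follow the `numerai_predict_py_<major>_<minor>:<tag>` pattern shown above:

```python
import re

def python_version_from_image(name: str) -> tuple[int, int]:
    """Extract (major, minor) from a name like 'numerai_predict_py_3_12:a78dedd'."""
    m = re.search(r"py_(\d+)_(\d+)", name)
    if not m:
        raise ValueError(f"cannot find a Python version in image name: {name!r}")
    return int(m.group(1)), int(m.group(2))
```

Use the result to pick the matching pyenv interpreter in the next step.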
Step 2: Create Matching Virtual Environment with pyenv
Use pyenv to create a virtual environment with the exact Python version:
# 1. List available pyenv Python versions
ls ~/.pyenv/versions/
# 2. Find the matching minor version (e.g., for Python 3.12)
PYENV_PY=$(ls -d ~/.pyenv/versions/3.12.* 2>/dev/null | head -1)
# 3. Create the virtual environment
$PYENV_PY/bin/python -m venv ./venv
# 4. Activate and install pkl dependencies
source ./venv/bin/activate
pip install --upgrade pip
pip install numpy pandas cloudpickle scipy
# Add lightgbm, torch, etc. only if your model needs them
Step 3: Create pkl in the Correct Environment
Always create pkl files using the matching venv:
./venv/bin/python create_model_pkl.py
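As a rough illustration of what a `create_model_pkl.py` might contain, here is a minimal sketch: a self-contained "inference bundle" of plain numpy data, a `predict` closure that reads only from it, and a cloudpickle dump. The bundle contents (a toy linear model with features `feature_a`/`feature_b`) are hypothetical; your real script would export your trained model's weights and preprocessing stats.

```python
# create_model_pkl.py - hypothetical sketch of a bundle-based export.
import numpy as np
import pandas as pd

# The "inference bundle": everything predict() needs, as plain Python/numpy data.
BUNDLE = {
    "features": ["feature_a", "feature_b"],   # training-time feature order
    "impute": {"feature_a": 0.5, "feature_b": 0.5},
    "weights": np.array([[0.3], [0.7]]),      # (n_features, 1) linear weights
    "bias": np.array([0.0]),
}

def predict(live_features: pd.DataFrame,
            live_benchmark_models: pd.DataFrame) -> pd.DataFrame:
    # Select columns in training order, impute, then run a numpy forward pass.
    X = live_features[BUNDLE["features"]].fillna(BUNDLE["impute"]).to_numpy()
    raw = X @ BUNDLE["weights"] + BUNDLE["bias"]
    out = pd.DataFrame({"prediction": raw.ravel()}, index=live_features.index)
    # Rank per era into (0, 1] when an era column is available.
    if "era" in live_features.columns:
        out["prediction"] = out["prediction"].groupby(live_features["era"]).rank(pct=True)
    return out

if __name__ == "__main__":
    import cloudpickle  # must come from the matching venv
    with open("model.pkl", "wb") as f:
        cloudpickle.dump(predict, f)
```

Because `predict` closes over only numpy arrays and plain Python data, the pickle carries no repo imports.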
Requirements
- Implement `predict(live_features, live_benchmark_models)` and return a DataFrame with a `prediction` column aligned to the input index.
- Preserve training-time preprocessing (feature order, imputation values, scaling params) inside the pickle.
- Avoid imports from local repo modules (no `agents.*`), because Numerai's container will not have them.
- Prefer numpy/pandas/scipy-only inference; do not rely on torch/xgboost unless you verify the container has those packages.
- Move any trained model to CPU before exporting and store plain numpy weights.
- Validate required columns (`era` for per-era ranking, benchmark column if used).
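The column-validation requirement can be handled with a small guard at the top of `predict`. This is a sketch (the helper name `validate_inputs` is mine), assuming you want a clear error message rather than a cryptic `KeyError` inside the container:

```python
import pandas as pd

def validate_inputs(live_features: pd.DataFrame, feature_names: list[str],
                    require_era: bool = True) -> None:
    """Fail fast with a readable message if required columns are missing."""
    missing = [c for c in feature_names if c not in live_features.columns]
    if missing:
        raise ValueError(f"missing feature columns (first 5 shown): {missing[:5]}")
    if require_era and "era" not in live_features.columns:
        raise ValueError("per-era ranking requested but live_features has no 'era' column")
```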
Workflow
- Query the default Docker image from the MCP to determine the required Python version.
- Create/activate a matching venv using pyenv (see above).
- Only after your research has plateaued and you have selected the final configuration to deploy, train on the full dataset (train + validation) with the same preprocessing and early-stopping scheme as the best model.
- Export an inference bundle from the trained model:
  - Feature list and ordering
  - Imputation values and scaling stats
  - Model weights/biases (numpy arrays)
  - Activation name and any constants
  - Benchmark column name if needed as a feature
- Build a `predict` function that:
  - Reads only from the bundle and standard libraries
  - Applies preprocessing and a numpy forward pass
  - Ranks predictions per era to [0, 1] when required
- Dump `predict` with `cloudpickle` (e.g. `cloudpickle.dump(predict, open("model.pkl", "wb"))`) using the matching venv's Python.
- Test the pickle with the Numerai container before uploading.
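For the per-era ranking step, pandas does the heavy lifting. A minimal sketch (the helper name `rank_per_era` is mine): `rank(pct=True)` maps each era's predictions into (0, 1]; rescale afterwards if a strict [0, 1] range is needed.

```python
import pandas as pd

def rank_per_era(preds: pd.Series, eras: pd.Series) -> pd.Series:
    """Percentile-rank predictions within each era; result lies in (0, 1]."""
    return preds.groupby(eras).rank(pct=True)
```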
Testing
Run the Numerai debug container locally (use the same image tag as the default):
# Get the default image tag from MCP query, then test:
docker run -i --rm -v "$PWD:$PWD" ghcr.io/numerai/numerai_predict_py_3_12:a78dedd --debug --model $PWD/[PICKLE_FILE]
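If you script the test step, it helps to build the docker invocation programmatically so the image tag stays a single variable you swap in from the MCP query. A sketch (the function `debug_command` is mine; the tag in the test is the example tag from above, not necessarily the current default):

```python
import shlex

def debug_command(image: str, pickle_path: str, workdir: str) -> str:
    """Build the local debug invocation for the numerai_predict container."""
    return (
        f"docker run -i --rm -v {shlex.quote(workdir)}:{shlex.quote(workdir)} "
        f"ghcr.io/numerai/{image} --debug --model {shlex.quote(pickle_path)}"
    )
```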
Common Pitfalls
- Segmentation fault / numpy binary incompatibility: The pkl was created with a different Python version than Numerai’s container. Always query the default docker image first and create pkl files using a matching pyenv-based venv.
- `ImportError: No module named 'agents'`: occurs when the pickle references repo classes. Fix by exporting a pure-numpy inference bundle and rebuilding `predict` without repo imports.
- Missing `era` column: per-era ranking requires `live_features["era"]`.
- Benchmark misalignment: ensure `live_benchmark_models` is reindexed to `live_features` (by id) before use.
- Feature drift: ensure feature order in inference matches training order exactly.
Debugging Validation Failures
If your pickle fails validation, query the trigger status and logs:
query {
account {
models {
username
computePickleUpload {
filename
validationStatus
triggerStatus
triggers {
id
status
statuses {
status
description
insertedAt
}
}
}
}
}
}
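If you prefer to poll this status from a script rather than through the MCP, the query can be sent over plain HTTP. This is a sketch under stated assumptions: the endpoint `https://api-tournament.numer.ai` and the `Authorization: Token PUBLIC_ID$SECRET_KEY` header follow Numerai's public GraphQL API conventions, and the function names are mine; verify both before relying on them.

```python
import json
import urllib.request

STATUS_QUERY = """query { account { models { username computePickleUpload {
  filename validationStatus triggerStatus } } } }"""

def build_request(query: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST; token is PUBLIC_ID$SECRET_KEY."""
    return urllib.request.Request(
        "https://api-tournament.numer.ai",  # assumed endpoint
        data=json.dumps({"query": query}).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Token {token}"},
        method="POST",
    )

# To actually send it:
#   with urllib.request.urlopen(build_request(STATUS_QUERY, token)) as resp:
#       print(json.load(resp))
```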
Common error descriptions:
- "Segmentation fault! Ensure python and library versions match our environment." → Python/numpy version mismatch
- "No currently open rounds!" → Model validated successfully, but no round is open for submission
Reference
- Use `numerai/example_model.ipynb` for the expected `predict` signature and output format.
Deploying to Numerai via MCP Server
After creating and testing your pkl file, you can deploy it to Numerai using the Numerai MCP server. The MCP server provides tools for creating models and uploading pkl files programmatically.
Available MCP Tools
The numerai MCP server provides these key tools:
- `check_api_credentials` – Verify your API token and see granted scopes
- `create_model` – Create a new model in a tournament
- `upload_model` – Upload pkl files (multi-step workflow)
- `graphql_query` – List existing models and perform custom queries
Authentication
All authenticated operations require a Numerai API token with the `upload_submission` scope:
- Format: `PUBLIC_ID$SECRET_KEY`
- Get your API key from https://numer.ai/account
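When handling the token in a script, split it on the first `$` only, since a secret could in principle contain that character. A small sketch (the helper name `split_token` is mine):

```python
def split_token(token: str) -> tuple[str, str]:
    """Split a PUBLIC_ID$SECRET_KEY token on the first '$' only."""
    public_id, sep, secret_key = token.partition("$")
    if not sep or not public_id or not secret_key:
        raise ValueError("expected token in PUBLIC_ID$SECRET_KEY format")
    return public_id, secret_key
```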
Option 1: Upload to an Existing Model
If you already have a model slot you want to use:
1. List your models using `graphql_query`:
   `query { account { models { id name } } }`
2. Get upload authorization for your pkl file:
   - Call `upload_model` with `operation: "get_upload_auth"`, `modelId: "<model_uuid>"`, `filename: "model.pkl"`
   - This returns a presigned URL for uploading
3. Upload the pkl file:
   - PUT the pkl file to the presigned URL
4. Register the upload with Numerai:
   - Call `upload_model` with `operation: "create"`, `modelId: "<model_uuid>"`, `filename: "model.pkl"`
   - This triggers validation of your pickle
5. Check validation status:
   - Call `upload_model` with `operation: "list"` to see all pickles and their status
   - Wait for validation to complete successfully
6. Assign the pickle to the model slot:
   - Call `upload_model` with `operation: "assign"`, `modelId: "<model_uuid>"`, `pickleId: "<pickle_uuid>"`
   - This makes the pickle active for automated submissions
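The PUT-to-presigned-URL step can be done with the standard library; no auth headers are needed because presigned URLs carry their credentials in the query string. A sketch (the function `build_put` is mine; the URL in the test is a placeholder):

```python
import urllib.request
from pathlib import Path

def build_put(presigned_url: str, pickle_path: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request uploading the pkl bytes."""
    return urllib.request.Request(
        presigned_url,
        data=Path(pickle_path).read_bytes(),
        method="PUT",
    )

# To actually send it:
#   with urllib.request.urlopen(build_put(url, "model.pkl")) as resp:
#       assert resp.status in (200, 204)
```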
Option 2: Create a New Model and Upload
If you want to create a new model slot:
1. Create the model:
   - Call `create_model` with `name: "<unique_model_name>"`, `tournament: 8` (for Classic)
   - Note: Model names must be unique within the tournament
2. Get the model ID from the response.
3. Follow steps 2-6 from Option 1 to upload and assign the pkl file.
Upload Workflow Summary
PKL DEPLOYMENT WORKFLOW

1. Create pkl file (this skill's main workflow)
2. Test pkl locally with numerai_predict container
3. Choose: create new model OR use existing model
   - For new model: create_model(name, tournament=8)
   - For existing model: graphql_query to list models and get model ID
4. upload_model(operation="get_upload_auth", modelId, filename)
5. upload_model(operation="put_file", presignedUrl, localPath)
6. upload_model(operation="create", modelId, filename)
7. upload_model(operation="list") - wait for validation
8. upload_model(operation="assign", modelId, pickleId)

Optional:
- upload_model(operation="trigger", pickleId) to test
- upload_model(operation="get_logs", pickleId, triggerId)
Important Notes
- Only the Classic tournament (tournament=8) supports pickle uploads
- The model must have its submission webhook disabled before uploading
- CRITICAL: Before creating pkl files, query the default docker image to ensure Python version compatibility
- Use this GraphQL query to check available runtimes and the default:
  `query { computePickleDockerImages { id name image tag default } }`
- Use `upload_model(operation="list_data_versions")` to see available dataset versions
- After assignment, Numerai will automatically run your pickle each round
Pre-Upload Checklist
Before uploading a pkl file, verify:
- ✅ Queried `computePickleDockerImages` to get the default Python version
- ✅ Created venv using pyenv with matching Python version
- ✅ Created pkl file using the matching venv's Python interpreter
- ✅ Tested pkl locally with the matching docker container (optional but recommended)
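The version-match item on the checklist can be enforced in the export script itself. A sketch (the helper name `matches_container` is mine): pass in the (major, minor) tuple you got from the default docker image and abort the export on mismatch.

```python
import sys

def matches_container(required: tuple[int, int]) -> bool:
    """True when the current interpreter's major.minor matches the container's Python."""
    return (sys.version_info.major, sys.version_info.minor) == required

# In create_model_pkl.py, before dumping:
#   assert matches_container((3, 12)), "run this script with the matching venv"
```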
Triggering and Debugging
After assigning a pickle, you can manually trigger it for testing:
1. Trigger the pickle:
   - Call `upload_model` with `operation: "trigger"`, `pickleId: "<pickle_uuid>"`, `triggerValidation: true`
2. View execution logs:
   - Call `upload_model` with `operation: "get_logs"`, `pickleId: "<pickle_uuid>"`, `triggerId: "<trigger_uuid>"`
Asking the User
Before deploying, confirm with the user:
- Do they want to deploy the pkl to Numerai?
- Should we create a new model or upload to an existing one?
- If new: what name should the model have?
- If existing: which model should receive the upload?
- Do they have their API token ready (or is it already configured)?