numerai-model-implementation
Install command:
npx skills add https://github.com/numerai/example-scripts --skill numerai-model-implementation
Skill Documentation
# Numerai Model Implementation

## Overview

Add a new model type so it can be selected in configs and trained/evaluated by the base pipeline.

Note: run commands from `numerai/` (so `agents` is importable), or from the repo root with `PYTHONPATH=numerai`.
## Implement a New Model Type

- Define the model API and output shape.
  - Implement `fit(X, y, sample_weight=...)` and `predict(X)`.
  - Put custom wrappers in `agents/code/modeling/models/` so model-specific code stays isolated.
  - Accept pandas DataFrames or convert to NumPy inside the model wrapper.
- Register the model constructor in `agents/code/modeling/utils/model_factory.py`.
  - Use lazy imports so optional dependencies do not break other workflows.
  - Raise a clear `ImportError` when the dependency is missing.
```python
if model_type == "XGBRegressor":
    try:
        from xgboost import XGBRegressor
    except ImportError as exc:
        raise ImportError(
            "xgboost is required for XGBRegressor. Install with `.venv/bin/pip install xgboost`."
        ) from exc
    return XGBRegressor(**model_params)
```
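As a concrete illustration of the `fit`/`predict` interface described in the first step, here is a minimal sketch of a custom wrapper. The class and its behavior are hypothetical (it just predicts the weighted mean of the training target); a real wrapper in `agents/code/modeling/models/` would delegate to an actual estimator.

```python
import numpy as np


class MeanBaselineModel:
    """Hypothetical wrapper showing the required fit/predict interface.

    Predicts the (optionally weighted) mean of the training target;
    a real wrapper would delegate to an actual estimator.
    """

    def __init__(self, **params):
        self.params = params
        self.mean_ = None

    def fit(self, X, y, sample_weight=None):
        # Accept pandas objects or arrays; convert to NumPy inside the wrapper.
        y = np.asarray(y, dtype=float)
        if sample_weight is None:
            self.mean_ = float(y.mean())
        else:
            self.mean_ = float(np.average(y, weights=np.asarray(sample_weight, dtype=float)))
        return self

    def predict(self, X):
        # Constant prediction, one value per input row.
        return np.full(len(X), self.mean_)
```

Keeping the conversion to NumPy inside the wrapper lets the rest of the pipeline pass DataFrames without caring about each model's input requirements.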
- Add or update a config to use the new model type.
```python
CONFIG = {
    "model": {"type": "XGBRegressor", "params": {"n_estimators": 500}},
    "training": {"cv": {"n_splits": 5}},
    "data": {"data_version": "v5.2", "feature_set": "small", "target_col": "target", "era_col": "era"},
    "output": {},
    "preprocessing": {},
}
```
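Tying the two steps together, a config like the one above can drive model construction through the factory. This is a sketch under assumptions: the `create_model` name is hypothetical, and the real entry point in `agents/code/modeling/utils/model_factory.py` may differ.

```python
def create_model(model_type, model_params):
    # Hypothetical factory entry point (the real function name in
    # agents/code/modeling/utils/model_factory.py may differ).
    if model_type == "XGBRegressor":
        try:
            from xgboost import XGBRegressor  # lazy import: optional dependency
        except ImportError as exc:
            raise ImportError(
                "xgboost is required for XGBRegressor. "
                "Install with `.venv/bin/pip install xgboost`."
            ) from exc
        return XGBRegressor(**model_params)
    raise ValueError(f"Unknown model type: {model_type!r}")


CONFIG = {"model": {"type": "XGBRegressor", "params": {"n_estimators": 500}}}
model_cfg = CONFIG["model"]
# model = create_model(model_cfg["type"], model_cfg["params"])
```

Raising `ValueError` for unregistered types keeps config typos from failing silently later in the pipeline.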
- Add extra data columns if the model needs them.
  - Update `load_and_prepare_data` in `agents/code/modeling/utils/pipeline.py` to pass extra columns into `load_full_data`.
  - Add corresponding config entries so experiments stay reproducible.
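The extra-column plumbing above can be sketched as follows. This is an assumption-laden sketch: `load_full_data` is stubbed out, the `extra_cols` config key is invented for illustration, and the real signatures in `pipeline.py` may differ.

```python
def load_full_data(feature_set, extra_cols=()):
    # Stub standing in for the real loader; it only records which
    # columns were requested so the flow is visible.
    return {"feature_set": feature_set, "columns": list(extra_cols)}


def load_and_prepare_data(config):
    data_cfg = config["data"]
    # Hypothetical config key; keeping it in the config (rather than
    # hard-coding columns) is what keeps experiments reproducible.
    extra_cols = data_cfg.get("extra_cols", [])
    return load_full_data(data_cfg["feature_set"], extra_cols=extra_cols)
```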
## Validate

- Run a smoke test: `.venv/bin/python -m agents.code.modeling --config <config_path>`.
- Run metrics on the smoke test and make sure `corr_mean` is greater than 0.005 and less than 0.04. If it is lower, something is probably fundamentally wrong; if it is higher, there is likely leakage and you need to find the problem.
- Double-check that any early-stopping mechanisms or modifications to the fit/predict loop do not over-estimate accuracy. Accurately estimating performance is of paramount importance on Numerai, because we need to be able to decide whether to stake.
- Run unit tests after refactors: `.venv/bin/python -m unittest`.
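A rough per-era correlation check for the smoke test can be sketched as below. Note this uses a plain Pearson correlation on ranked predictions as a simplified stand-in; Numerai's official scoring metric (`numerai_corr`) differs in detail, so use the repo's real metrics code for actual decisions.

```python
import numpy as np
import pandas as pd


def corr_mean(df, pred_col="prediction", target_col="target", era_col="era"):
    # Mean across eras of the correlation between ranked predictions and
    # the target (simplified stand-in for Numerai's official metric).
    corrs = [
        np.corrcoef(group[pred_col].rank(pct=True), group[target_col])[0, 1]
        for _, group in df.groupby(era_col)
    ]
    return float(np.mean(corrs))
```

Averaging per era (rather than pooling all rows) is the important part: it mirrors how Numerai scores rounds and prevents one unusual era from dominating the estimate.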
## Next Steps

After validating the model implementation:

- Use the `numerai-experiment-design` skill to run multiple rounds of experiments (4–5 configs per round), then scale winners until you hit a plateau.
- Use the `numerai-model-upload` skill to create a pkl file only after you have a stable, scaled "best model" you intend to deploy.
- Deploy to Numerai using the MCP server (see the `numerai-model-upload` skill for the deployment workflow).