wasde-ingestor
npx skills add https://github.com/fatfingererr/macro-skills --skill wasde-ingestor
Skill documentation
<essential_principles> WASDE Ingestor core principles
1. Commodity scope
WASDE covers five commodity categories, each with its own table structure and units:
| Category | Commodities | US units | World units |
|---|---|---|---|
| Grains | Wheat, Corn, Rice, Barley, Sorghum, Oats | million bushels / cwt | million metric tons |
| Oilseeds | Soybeans, Soybean Oil, Soybean Meal | million bushels / pounds | million metric tons |
| Cotton | Upland, Pima | million 480-lb bales | million 480-lb bales |
| Livestock | Beef, Pork, Poultry, Eggs, Dairy | million lbs / dozen / lbs | – (US only) |
| Sugar | Sugar | 1,000 short tons | – (US + Mexico only) |
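Because the US tables report grains and oilseeds in bushels while the World tables use metric tons, comparing the two scopes requires a unit conversion. The following is a minimal sketch, assuming the standard US bushel weights (56 lb for corn, 60 lb for wheat and soybeans). The function name and lookup table are illustrative and are not part of the skill's scripts.

```python
# Illustrative helper; not taken from the skill's own code.
LB_PER_METRIC_TON = 2204.62262

# Standard US bushel weights in pounds (assumed lookup table).
LB_PER_BUSHEL = {"corn": 56.0, "wheat": 60.0, "soybeans": 60.0}


def million_bushels_to_mmt(commodity: str, million_bushels: float) -> float:
    """Convert a quantity in million bushels to million metric tons."""
    pounds_per_bushel = LB_PER_BUSHEL[commodity]
    return million_bushels * pounds_per_bushel / LB_PER_METRIC_TON
```

Commodities reported in cwt, pounds, or bales would need their own factors, which are left out here.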
2. Table structure consistency
All supply and demand tables follow the same balance formula:
Ending Stocks = Beginning Stocks + Production + Imports - Total Use - Exports
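The balance formula can be checked row by row. The sketch below is one possible way to do it, not the skill's validate_data.py. The field names are hypothetical, `total_use` is assumed to exclude exports (since the formula subtracts exports separately), and the tolerance is an assumed allowance for rounding in the published figures.

```python
def balance_residual(row: dict) -> float:
    """Return reported ending stocks minus the value implied by the formula."""
    implied = (
        row["beginning_stocks"]
        + row["production"]
        + row["imports"]
        - row["total_use"]  # assumed to exclude exports
        - row["exports"]
    )
    return row["ending_stocks"] - implied


def balances(row: dict, tolerance: float = 1.0) -> bool:
    """True when the residual is within the rounding tolerance."""
    return abs(balance_residual(row)) <= tolerance
```

If a source table reports a total use figure that already includes exports, the `exports` term would have to be dropped to avoid double counting.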
3. Version control
- Each monthly report may revise data from earlier periods
- Use content_hash to track changes in the data
- Keep revision_flag to mark revised values
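One way to derive these two fields is to hash a canonical serialization of each record and compare it with the hash stored from the previous report. This is a sketch under assumptions: the skill does not document how content_hash is computed, so SHA-256 over sorted-key JSON is an illustrative choice, and both function names are hypothetical.

```python
import hashlib
import json
from typing import Optional


def content_hash(record: dict) -> str:
    """SHA-256 of the record serialized with sorted keys, so key order is irrelevant."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def flag_revision(previous: Optional[dict], current: dict) -> dict:
    """Return a copy of `current` with content_hash and revision_flag attached."""
    new_hash = content_hash(current)
    # A record counts as revised only if an earlier version exists and its hash differs.
    revised = previous is not None and previous.get("content_hash") != new_hash
    return {**current, "content_hash": new_hash, "revision_flag": revised}
```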
4. Parsing priority
- Structured PDF parsing (preferred)
- HTML table parsing (fallback)
- Fixed-width TXT parsing (legacy) </essential_principles>
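The parsing priority above amounts to a fallback chain: try each format in order and keep the first result that yields tables. A minimal sketch follows. The parser callables and the `sources` mapping are hypothetical stand-ins, not the interface of parse_tables.py.

```python
def parse_with_fallback(sources: dict, parsers: list) -> tuple:
    """Try (format, parser) pairs in priority order; return the first non-empty result.

    `sources` maps a format name to its raw content; `parsers` is an ordered
    list of (format name, callable) pairs.
    """
    errors = {}
    for fmt, parser in parsers:
        raw = sources.get(fmt)
        if raw is None:
            continue  # this format was not downloaded
        try:
            tables = parser(raw)
        except Exception as exc:  # a failing parser should not stop the chain
            errors[fmt] = str(exc)
            continue
        if tables:
            return fmt, tables
    raise ValueError(f"no parser succeeded: {errors}")
```

Returning the format name alongside the tables makes it possible to record which path produced the data.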
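- Ingest – fetch and parse the latest WASDE report, or the report for a specified month
- Backfill – bulk-backfill historical WASDE data
- Validate – verify that existing data is consistent with the latest report
- Configure – set the commodity scope, output path, and other parameters
Wait for a response before continuing.
After reading the workflow, follow its steps exactly.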
<reference_index>
Commodity reference files (references/)
| File | Contents |
|---|---|
| commodities-overview.md | Overview of all commodities and their unit mappings |
| grains.md | Grains (Wheat, Corn, Rice, Barley, Sorghum, Oats) |
| oilseeds.md | Oilseeds (Soybeans, Soybean Oil, Soybean Meal) |
| cotton.md | Cotton (US & World) |
| livestock.md | Livestock products (Beef, Pork, Poultry, Eggs, Dairy) |
| sugar.md | Sugar (US & Mexico) |
| validation-rules.md | Validation rules and sanity checks |
| failure-modes.md | Failure modes and mitigation strategies |
</reference_index>
<workflows_index>
| Workflow | Purpose |
|---|---|
| ingest.md | Fetch and parse one or more WASDE reports |
| backfill.md | Bulk-backfill historical data |
| validate.md | Validate data consistency and track revisions |
| configure.md | Configure the commodity scope and output settings |
</workflows_index>
<templates_index>
| Template | Purpose |
|---|---|
| config.yaml | Standard configuration file template |
| us-balance-schema.yaml | Schema for the US supply and demand table |
| world-balance-schema.yaml | Schema for the World supply and demand table |
</templates_index>
<scripts_index>
| Script | Purpose |
|---|---|
| discover_releases.py | Discover WASDE release dates and URLs |
| fetch_report.py | Download the PDF/HTML report |
| parse_tables.py | Parse tables and extract data |
| validate_data.py | Run validation checks |
</scripts_index>
<examples_index>
Example output (examples/)
| File | Contents |
|---|---|
| corn_us_balance.json | Example US corn supply and demand balance sheet |
| wheat_world_balance.json | Example World wheat supply and demand balance sheet |
| ingest_status_success.json | Example status report for a successful run |
| backfill_result.json | Example result of a bulk backfill job |
</examples_index>
<quick_start> CLI quick start:
# Ingest all commodities from the latest report
macro-skills wasde ingest --commodities=all --out=./data
# Ingest grains only
macro-skills wasde ingest --commodities=grains --out=./data
# Backfill soybean data for the past 24 months
macro-skills wasde backfill --commodity=soybeans --months=24 --out=./data
# Validate existing data
macro-skills wasde validate --data-dir=./data
Library quick start:
from macro_skills.wasde import WASDEIngestor
ingestor = WASDEIngestor(
    commodities=["wheat", "corn", "soybeans"],
    scope=["us", "world"],
    output_dir="./data/wasde",
)
result = ingestor.ingest_latest()
print(result.status)
</quick_start>
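<success_criteria> The skill has run successfully when:
- Commodity categories and their corresponding tables are identified correctly
- Supply and demand data is parsed for every specified commodity
- Units are converted and labeled correctly
- The balance formula validation passes
- A standardized parquet dataset is written
- An ingest_status report is generated </success_criteria>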
<detailed_usage_examples> Detailed usage examples
Example 1: Ingest the latest corn and soybean data
User: Fetch the latest WASDE supply and demand data for corn and soybeans
Assistant steps:
1. Read workflows/ingest.md
2. Call discover_releases.py to find the latest release (2025-01-10)
3. Download the PDF report
4. Parse the corn_us, corn_world, soybeans_us, soybeans_world tables
5. Validate the balance formula
6. Write the parquet files
7. Report ingest_status
Expected output:
- ./data/wasde/curated/corn_us_balance.parquet
- ./data/wasde/curated/corn_world_balance.parquet
- ./data/wasde/curated/soybeans_us_balance.parquet
- ./data/wasde/curated/soybeans_world_balance.parquet
Example 2: Backfill historical wheat data and track revisions
User: Backfill the past 12 months of wheat data; I want to track USDA's revision history
Assistant steps:
1. Read workflows/backfill.md
2. Determine the date range (2024-01 to 2024-12)
3. Process the 12 monthly reports in order
4. Compare the data from each report and flag revised values
5. Write the complete historical data, including revision_flag
Expected output:
- 12 months of wheat data
- Revision tracking report
Example 3: Validate the consistency of existing data
User: Check whether my existing WASDE data is consistent with the latest report
Assistant steps:
1. Read workflows/validate.md
2. Load the existing parquet files
3. Fetch the latest report
4. Compare field by field
5. Flag differences and revisions
6. Write the validation report
Expected output:
- validation_report.json (list of differences)
- Suggested corrections
</detailed_usage_examples>
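The field-by-field comparison in Example 3 can be sketched as a small diff over two records. This is an illustration only: the record layout, the function name, and the shape of each difference entry are assumptions, and the actual format of validation_report.json is not specified here.

```python
def diff_records(existing: dict, latest: dict, tolerance: float = 0.0) -> list:
    """List the fields whose values differ between the stored and latest record."""
    differences = []
    for field in sorted(set(existing) | set(latest)):
        old, new = existing.get(field), latest.get(field)
        both_numeric = isinstance(old, (int, float)) and isinstance(new, (int, float))
        if both_numeric:
            changed = abs(new - old) > tolerance
        else:
            # Covers missing fields (None) and non-numeric values.
            changed = old != new
        if changed:
            differences.append({"field": field, "existing": old, "latest": new})
    return differences
```

A non-zero tolerance can suppress differences that come only from rounding.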
<data_pipeline_architecture> Data pipeline architecture
[USDA WASDE Website]
         |
         v
+-------------------+
| discover_releases | --> Find PDF URLs and release dates
+-------------------+
         |
         v
+-------------------+
| fetch_report      | --> Download PDF (fallback: HTML)
+-------------------+
         |
         v
+-------------------+
| parse_tables      | --> Extract supply and demand table data
+-------------------+
         |
         v
+-------------------+
| normalize_data    | --> Normalize fields and units
+-------------------+
         |
         v
+-------------------+
| validate_data     | --> Balance formula and range checks
+-------------------+
         |
         v
+-------------------+
| write_output      | --> Parquet / JSON
+-------------------+
         |
         v
[Curated Dataset]
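The stages in the diagram run strictly in sequence, each consuming the output of the one before it. The sketch below shows that control flow with a simple status record. The stage callables are placeholders, and the status dictionary is a hypothetical shape, not the skill's ingest_status schema.

```python
def run_stages(stages: list, payload):
    """Run (name, callable) stages in order, stopping at the first failure.

    Returns the last successful payload and a status dictionary.
    """
    status = {"completed": [], "failed": None, "error": None}
    for name, stage in stages:
        try:
            payload = stage(payload)
        except Exception as exc:
            status["failed"] = name
            status["error"] = str(exc)
            break
        status["completed"].append(name)
    return payload, status
```

Stopping at the first failed stage keeps later stages from running on incomplete data, and the status record shows where the run ended.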
Directory structure:
data/wasde/
├── raw/                        # Raw PDF files
│   └── 2025-01-10/
│       └── wasde0125.pdf
├── intermediate/               # Intermediate parse results
│   └── 2025-01-10/
│       ├── tables.json
│       └── ingest_status.json
└── curated/                    # Final normalized data
    ├── corn_us_balance.parquet
    ├── corn_world_balance.parquet
    ├── wheat_us_balance.parquet
    └── ...
</data_pipeline_architecture>
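The per-release paths in the directory structure can be derived from the release date. This is a minimal sketch under two assumptions: that tables.json and ingest_status.json sit under a dated folder inside intermediate/, and that the PDF name follows a wasdeMMYY.pdf pattern inferred from the single example wasde0125.pdf. The function name is hypothetical.

```python
from datetime import date
from pathlib import Path


def release_paths(root: Path, release: date) -> dict:
    """Build the raw and intermediate paths for one release date."""
    day = release.isoformat()  # for example 2025-01-10
    pdf_name = f"wasde{release:%m%y}.pdf"  # naming pattern assumed from wasde0125.pdf
    return {
        "raw_pdf": root / "raw" / day / pdf_name,
        "tables": root / "intermediate" / day / "tables.json",
        "status": root / "intermediate" / day / "ingest_status.json",
    }
```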