hex-gemini-bq-patterns
npx skills add https://github.com/4l3k51/hex-gemini-bq-patterns --skill hex-gemini-bq-patterns
Agent 安装分布
Skill 文档
Hex.tech Patterns
Project Secrets and Variables
Hex injects project secrets directly into the notebook kernel namespace as bare Python variables. They are set in the Hex UI sidebar under “Secrets” (not “Environment variables”) and appear in the Variable explorer at runtime. They are NOT environment variables.
# Correct â bare variable, set in Hex Variables panel
response = call_api(gemini_vertex_api_key, prompt)
conversation = get_data(thread_url)
NEVER do any of these:
api_key = os.environ["GEMINI_API_KEY"] # WRONG
api_key = os.getenv("GEMINI_API_KEY") # WRONG
api_key = "AIzaSy..." # WRONG
from dotenv import load_dotenv; load_dotenv() # WRONG
api_key = globals()["gemini_api_key"] # WRONG
Data Connections
import hextoolkit as htk
connection = htk.get_data_connection("your-connection-name")
df = connection.query("SELECT * FROM table LIMIT 10")
Do not use SQLAlchemy engines, BigQuery client libraries, or raw connection strings.
LLM API Calls â Gemini / Vertex AI
Use raw REST via requests. Do NOT use provider SDKs (google.cloud.aiplatform, vertexai, etc.).
Generation
import requests
url = (
f"https://{VERTEX_CONFIG['location']}-aiplatform.googleapis.com/v1/"
f"projects/{VERTEX_CONFIG['project_id']}/locations/{VERTEX_CONFIG['location']}/"
f"publishers/google/models/{VERTEX_CONFIG['generate_model']}:generateContent"
)
payload = {
"contents": [{"role": "user", "parts": [{"text": prompt}]}]
}
response = requests.post(
url, params={"key": gemini_vertex_api_key}, json=payload,
headers={"Content-Type": "application/json"}
)
response.raise_for_status()
text = response.json()["candidates"][0]["content"]["parts"][0]["text"]
Embeddings
url = (
f"https://{VERTEX_CONFIG['location']}-aiplatform.googleapis.com/v1/"
f"projects/{VERTEX_CONFIG['project_id']}/locations/{VERTEX_CONFIG['location']}/"
f"publishers/google/models/{VERTEX_CONFIG['embed_model']}:predict"
)
payload = {"instances": [{"content": t} for t in texts]}
response = requests.post(
url, params={"key": gemini_vertex_api_key}, json=payload,
headers={"Content-Type": "application/json"}
)
response.raise_for_status()
embeddings = [p["embeddings"]["values"] for p in response.json()["predictions"]]
SQL Queries in Python
Execute SQL via hextoolkit connection, not via SQLAlchemy, pandas.read_sql(), or BigQuery client libraries.
Pure SQL filters (no Python variables)
connection = htk.get_data_connection("supabase-etl-prod-eu")
query = """
SELECT
story_id, title, url, hn_url, comments_text, comment_count
FROM `supabase-etl-prod-eu.raw_aleksi.supabase_hn_threads`
WHERE week_start = DATE_SUB(DATE_TRUNC(CURRENT_DATE(), WEEK(MONDAY)), INTERVAL 1 WEEK)
"""
df = connection.query(query)
Dynamic filters via f-string
connection = htk.get_data_connection("supabase-etl-prod-eu")
query = f"""
WITH target_thread AS (
SELECT DISTINCT title
FROM `supabase-etl-prod-eu.dbt.commonroom_webhook`
WHERE {activity_filter}
AND external_activity_url LIKE '{safe_url}%'
)
SELECT
m.title,
m.contact_full_name,
m.timestamp,
m.content
FROM `supabase-etl-prod-eu.dbt.commonroom_webhook` m
INNER JOIN target_thread t ON m.title = t.title
WHERE {activity_filter}
ORDER BY m.timestamp
"""
df = connection.query(query)
Retry Pattern
def call_gemini_with_retries(gemini_vertex_api_key, prompt, max_retries=3):
for attempt in range(max_retries):
try:
url = f"https://{VERTEX_CONFIG['location']}-aiplatform.googleapis.com/v1/projects/{VERTEX_CONFIG['project_id']}/locations/{VERTEX_CONFIG['location']}/publishers/google/models/{VERTEX_CONFIG['generate_model']}:generateContent"
payload = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
params = {"key": gemini_vertex_api_key}
headers = {"Content-Type": "application/json"}
response = requests.post(url, params=params, json=payload, headers=headers)
response.raise_for_status()
return response.json()["candidates"][0]["content"]["parts"][0]["text"]
except Exception as e:
print(f"Attempt {attempt + 1} failed: {e}")
if attempt == max_retries - 1:
raise e
time.sleep(2 ** attempt)
return None
Cell Timing
Add time.sleep(2) at the top of cells that depend on prior cell outputs to ensure previous cells have finished executing:
import time
time.sleep(2)
# rest of cell logic that depends on variables from prior cells
Cross-Cell Variable References
Variables defined in any cell (including Hex query cells like dataframe_52) are available in all subsequent cells. Reference them directly:
if dataframe_52 is None or dataframe_52.empty:
print("ERROR: No data")
elif not gemini_vertex_api_key:
print("ERROR: gemini_vertex_api_key not set")
else:
result = analyze(gemini_vertex_api_key, dataframe_52)