together-embeddings

📁 zainhas/togetherai-skills 📅 1 day ago
2
Total installs
1
Weekly installs
#72602
Site-wide rank
Install command
npx skills add https://github.com/zainhas/togetherai-skills --skill together-embeddings

Install distribution by agent

amp 1
cline 1
opencode 1
cursor 1
kimi-cli 1
codex 1

Skill documentation

Together Embeddings & Reranking

Overview

Generate vector embeddings for text and rerank documents by relevance.

  • Embeddings endpoint: /v1/embeddings
  • Rerank endpoint: /v1/rerank

Embeddings

Generate Embeddings

Python:

from together import Together
client = Together()

response = client.embeddings.create(
    model="BAAI/bge-large-en-v1.5",
    input="What is the meaning of life?",
)
print(response.data[0].embedding[:5])  # First 5 dimensions

TypeScript:

import Together from "together-ai";
const together = new Together();

const response = await together.embeddings.create({
  model: "BAAI/bge-large-en-v1.5",
  input: "What is the meaning of life?",
});
console.log(response.data[0].embedding.slice(0, 5));

cURL:

curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"BAAI/bge-large-en-v1.5","input":"What is the meaning of life?"}'
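
Once generated, embeddings are usually compared with cosine similarity; a minimal pure-Python sketch (no API call; the helper name is ours, not part of the SDK):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical directions score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

In practice you would pass two `response.data[i].embedding` lists in place of the toy vectors.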

Batch Embeddings

Python:

texts = ["First document", "Second document", "Third document"]
response = client.embeddings.create(
    model="BAAI/bge-large-en-v1.5",
    input=texts,
)
for i, item in enumerate(response.data):
    print(f"Text {i}: {len(item.embedding)} dimensions")

TypeScript:

import Together from "together-ai";
const together = new Together();

const response = await together.embeddings.create({
  model: "BAAI/bge-large-en-v1.5",
  input: [
    "First document",
    "Second document",
    "Third document",
  ],
});
for (const item of response.data) {
  console.log(`Index ${item.index}: ${item.embedding.length} dimensions`);
}

cURL:

curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-large-en-v1.5",
    "input": [
      "First document",
      "Second document",
      "Third document"
    ]
  }'

Embedding Models

| Model | API String | Dimensions | Max Input |
| --- | --- | --- | --- |
| BGE Large EN v1.5 | BAAI/bge-large-en-v1.5 | 1024 | 512 tokens |
| BGE Base EN v1.5 | BAAI/bge-base-en-v1.5 | 768 | 512 tokens |
| E5 Mistral 7B | intfloat/e5-mistral-7b-instruct | 4096 | 32768 tokens |
| GTE Large | thenlper/gte-large | 1024 | 512 tokens |
| UAE Large v1 | WhereIsAI/UAE-Large-V1 | 1024 | 512 tokens |
| M2 BERT 80M | togethercomputer/m2-bert-80M-8k-retrieval | 768 | 8192 tokens |
| M2 BERT 32K | togethercomputer/m2-bert-80M-32k-retrieval | 768 | 32768 tokens |
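
As a rough illustration of working with this table, here is a hypothetical helper (the function name and the model subset are ours) that picks the first listed model whose max-input budget fits an estimated token count, ordered from smallest to longest context:

```python
# Subset of the models above: (api_string, dimensions, max_input_tokens).
MODELS = [
    ("BAAI/bge-base-en-v1.5", 768, 512),
    ("BAAI/bge-large-en-v1.5", 1024, 512),
    ("togethercomputer/m2-bert-80M-8k-retrieval", 768, 8192),
    ("togethercomputer/m2-bert-80M-32k-retrieval", 768, 32768),
]

def pick_model(estimated_tokens):
    """Return the first model whose max-input budget covers the text."""
    for name, dims, max_tokens in MODELS:
        if estimated_tokens <= max_tokens:
            return name
    raise ValueError("input too long for every listed model")

print(pick_model(400))    # BAAI/bge-base-en-v1.5
print(pick_model(10000))  # togethercomputer/m2-bert-80M-32k-retrieval
```

For real use you would also weigh retrieval quality, not just context length; this only sketches the budget check.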

Reranking

Rerank a set of documents by relevance to a query:

Python:

response = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query="What is the capital of France?",
    documents=[
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "London is the capital of England.",
        "The Eiffel Tower is in Paris.",
    ],
)
for result in response.results:
    print(f"Index: {result.index}, Score: {result.relevance_score:.4f}")

TypeScript:

import Together from "together-ai";
const together = new Together();

const documents = [
  "Paris is the capital of France.",
  "Berlin is the capital of Germany.",
  "London is the capital of England.",
  "The Eiffel Tower is in Paris.",
];

const response = await together.rerank.create({
  model: "Salesforce/Llama-Rank-V1",
  query: "What is the capital of France?",
  documents,
  top_n: 2,
});

for (const result of response.results) {
  console.log(`Index: ${result.index}, Score: ${result.relevance_score}`);
}

cURL:

curl -X POST "https://api.together.xyz/v1/rerank" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Salesforce/Llama-Rank-V1",
    "query": "What is the capital of France?",
    "documents": ["Paris is the capital of France.", "Berlin is the capital of Germany."]
  }'

Rerank Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Rerank model (required) |
| query | string | Search query (required) |
| documents | string[] | Documents to rerank (required) |
| top_n | int | Return only the top N results |
| return_documents | bool | Include document text in the response |
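
When return_documents is false, each result carries only an index and a relevance_score, so you map indices back to your original list to recover the text. A small sketch (the plain dicts here stand in for the SDK's result objects):

```python
documents = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
]

# Stand-in for response.results: already sorted by descending relevance.
results = [
    {"index": 0, "relevance_score": 0.98},
    {"index": 1, "relevance_score": 0.03},
]

# Recover the ranked document texts from the indices.
ranked_texts = [documents[r["index"]] for r in results]
print(ranked_texts[0])  # "Paris is the capital of France."
```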

RAG Pipeline Pattern

Python:

# 1. Generate query embedding
query_embedding = client.embeddings.create(
    model="BAAI/bge-large-en-v1.5",
    input="How does photosynthesis work?",
).data[0].embedding

# 2. Retrieve candidates from vector DB (your code)
candidates = vector_db.search(query_embedding, top_k=20)

# 3. Rerank for precision
reranked = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query="How does photosynthesis work?",
    documents=[c.text for c in candidates],
    top_n=5,
)

# 4. Use top results as context for LLM
context = "\n".join([candidates[r.index].text for r in reranked.results])
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": f"Answer based on this context:\n{context}"},
        {"role": "user", "content": "How does photosynthesis work?"},
    ],
)
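
The vector-DB step above is your own code; as a toy stand-in, a brute-force in-memory search over (text, embedding) pairs might look like this (dot-product scoring, assuming normalized embeddings; all names are illustrative):

```python
def search(query_embedding, corpus, top_k=3):
    """Rank (text, embedding) pairs by dot product with the query vector."""
    scored = sorted(
        corpus,
        key=lambda item: sum(q * d for q, d in zip(query_embedding, item[1])),
        reverse=True,
    )
    return [text for text, _ in scored[:top_k]]

# Toy 2-d vectors for illustration; real embeddings have hundreds of dimensions.
corpus = [
    ("Photosynthesis converts light into chemical energy.", [0.9, 0.1]),
    ("Paris is the capital of France.", [0.1, 0.9]),
]
print(search([1.0, 0.0], corpus, top_k=1))
```

A real pipeline would replace this linear scan with an approximate nearest-neighbor index, then pass the top candidates to the reranker as shown above.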

Resources