chroma

📁 chroma-core/agent-skills 📅 Jan 21, 2026

总安装量

周安装量

#7671

全站排名

安装命令

npx skills add https://github.com/chroma-core/agent-skills --skill chroma

Agent 安装分布

codex 19

claude-code 17

opencode 15

gemini-cli 15

github-copilot 11

amp 9

Skill 文档

Instructions

Before writing any code, gather this information:

Deployment target: Local Chroma or Chroma Cloud?
- If Cloud: they’ll need API key, tenant, and database configured
- If Local: determine if they need persistence or ephemeral storage
Search type (Cloud only): Dense only, or hybrid search?
- Dense only: simpler setup, good for most semantic search
- Hybrid (dense + sparse): better for keyword-heavy queries, use SPLADE
Embedding model: Which provider/model?
- Default: @chroma-core/default-embed (TypeScript) or built-in (Python)
- OpenAI: text-embedding-3-large is most popular, requires @chroma-core/openai
- Ask the user if they have a preference or existing provider
Data structure: What are they indexing?
- Needed to determine chunking strategy
- Needed to design metadata schema for filtering

Decision workflow

User wants to add search
Ask Local Chroma or Chroma Cloud?
- Local Chroma
  - Use collection.query() with a dense embedding model
- Chroma Cloud
  - Ask if hybrid search is needed
    - Yes
      - Use Schema() + Search() APIs with SPLADE sparse index
    - No
      - Use collection.query() with a dense embedding model
Ask for which embedding model
Design metadata schema
Implement data sync strategy

When to ask questions vs proceed

Ask first:

Embedding model choice (cost and quality implications)
Cloud vs local deployment
Hybrid vs dense-only search
Multi-tenant data isolation strategy

Proceed with sensible defaults:

Use getOrCreateCollection() / get_or_create_collection()
Use cosine similarity (most common)
Chunk size under 8KB
Store source IDs in metadata for updates/deletes

What to validate

Environment variables are set for Cloud deployments
Correct client import (CloudClient vs Client)
Embedding function package is installed (TypeScript)
Schema and Search APIs only used with Cloud
Important: get_or_create_collection() accepts either an embedding_function OR a schema, but not both. Use Schema when you need multiple indexes (hybrid search) or sparse embeddings; use embedding_function for simple dense-only search.

Quick Start

Chroma Cloud Setup (CLI)

To get started with Chroma Cloud, use the CLI to log in, create a database, and write your credentials to a .env file:

chroma login
chroma db create <my_database_name>
chroma db connect <my_database_name> --env-file

This writes a .env file with CHROMA_API_KEY, CHROMA_TENANT, and CHROMA_DATABASE to the current directory. The code examples below read from these environment variables.

TypeScript (Chroma Cloud):

import { CloudClient } from 'chromadb';
import { DefaultEmbeddingFunction } from '@chroma-core/default-embed';

const client = new CloudClient({
  apiKey: process.env.CHROMA_API_KEY,
  tenant: process.env.CHROMA_TENANT,
  database: process.env.CHROMA_DATABASE,
});

const embeddingFunction = new DefaultEmbeddingFunction();
const collection = await client.getOrCreateCollection({
  name: 'my_collection',
  embeddingFunction,
});

// Add documents
await collection.add({
  ids: ['doc1', 'doc2'],
  documents: ['First document text', 'Second document text'],
});

// Query
const results = await collection.query({
  queryTexts: ['search query'],
  nResults: 5,
});

Python (Chroma Cloud):

import os
import chromadb

client = chromadb.CloudClient(
    api_key=os.environ["CHROMA_API_KEY"],
    tenant=os.environ["CHROMA_TENANT"],
    database=os.environ["CHROMA_DATABASE"],
)

collection = client.get_or_create_collection(name="my_collection")

# Add documents
collection.add(
    ids=["doc1", "doc2"],
    documents=["First document text", "Second document text"],
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5,
)

Understanding Chroma

Chroma is a database. A Chroma database contains collections. A collection contains documents.

Unlike tables in a relational database, collections are created and destroyed at the application level. Each Chroma database can have millions of collections. There may be a collection for each user, or team or organization. Rather than tables be partitioned by some key, the partition in Chroma is the collection.

Collections don’t have rows, they have documents, the document is the text data that is to be searched. When data is created or updated, the client will create an embedding of the data. This is done on the client side based on the embedding function(s) provided to the client. To create the embedding the client will use its configuration to call out to the defined embedding model provider via the embedding function. This could happen in process, but overwhelmingly happens on a third party service over HTTP.

There are ways to further partition or filtering data with document metadata. Each document has a key/value object of metadata. keys are strings and values can be strings, ints or booleans. There are a variety of operators on the metadata.

During query time, the query text is embedded using the collection’s defined embedding function and then is sent to Chroma with the rest of the query parameters. Chroma will then consider any query parameters like metadata filters to reduce the potential result set, then search for the nearest neighbors using a distance algorithm between the query vector and the index of vectors in the collection that is being queried.

Working with collections is made easy by using the get_or_create_collection() (getOrCreateCollection() in TypeScript) on the Chroma client, preventing annoying boilerplate code.

Local vs Cloud

Chroma can be run locally as a process or can be used in the cloud with Chroma Cloud.

Everything that can be done locally can be done in the cloud, but not everything that can be done in the cloud can be done locally.

The biggest difference to the developer experience is the Schema() and Search() APIs, those are only available on Chroma Cloud.

Otherwise, the only thing that needs to change is the client that is imported from the Chroma package, the interface is the same.

If you’re using cloud, you probably want to use the Schema() and Search() APIs.

Also, if the user wants to use cloud, ask them what type of search they want to use. Just dense embeddings, or hybrid. If hybrid, you probably want to use SPLADE as the sparse embedding strategy.

Embeddings

When working with embedding functions, the default embedding function is available, but it’s often not the best option. The recommended option is to use Chroma Cloud Qwen. Typescript: npm install @chroma-core/chroma-cloud-qwen, python, included but needs pip install httpx.

In typescript, you need to install a package for each embedding function, install the correct one based on what the user says.

Note that Chroma has server side embedding support for SPLADE and Qwen (via @chroma-core/chroma-cloud-qwen in typescript), all other embedding functions would be external.

Learn More

If you need more detailed information about Chroma beyond what’s covered in this skill, fetch Chroma’s llms.txt for comprehensive documentation: https://docs.trychroma.com/llms.txt

Available Topics

Typescript

Chroma Regex Filtering – Learn how to use regex filters in Chroma queries
Query and Get – Query and Get Data from Chroma Collections
Schema – Schema() configures collections with multiple indexes
Updating and Deleting – Update existing documents and delete data from collections
Error Handling – Handling errors and failures when working with Chroma
Local Chroma – How to run and use local chroma
Search() API – An expressive and flexible API for doing dense and sparse vector search on collections, as well as hybrid search

Python

Chroma Regex Filtering – Learn how to use regex filters in Chroma queries
Query and Get – Query and Get Data from Chroma Collections
Schema – Schema() configures collections with multiple indexes
Updating and Deleting – Update existing documents and delete data from collections
Error Handling – Handling errors and failures when working with Chroma
Local Chroma – How to run and use local chroma
Search() API – An expressive and flexible API for doing dense and sparse vector search on collections, as well as hybrid search

General

Chroma CLI – Getting started with the Chroma CLI for cloud database management
Data Model – An overview of how Chroma stores data
Integrating Chroma into an existing system – Guidance for adding Chroma search to an existing application

GitHub 仓库 ↗ ← 返回陌讯 Skills 聚合平台