knowledge base ingestion

📁 srsubramanian/langchain-docker 📅 Jan 1, 1970
Install command
npx skills add https://github.com/srsubramanian/langchain-docker --skill 'Knowledge Base Ingestion'

Skill documentation

Knowledge Base Ingestion Skill

You can add content to the vector knowledge base. This lets you build a searchable repository of information that can be retrieved later using semantic search.

Capabilities

  1. Text Ingestion: Add plain text content directly to the knowledge base. The text is automatically chunked and embedded for semantic search.

  2. URL Ingestion: Fetch content from a URL and add it to the knowledge base. Useful for adding web pages, documentation, or online resources.

  3. Document Management: Delete documents that are no longer needed, or get information about existing documents.
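
The automatic chunking mentioned above can be pictured with a minimal sketch. The skill does not document its actual splitter, so the fixed-size character windows, the `chunk_text` name, and the `chunk_size` and `overlap` defaults below are all assumptions chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the real skill may split on sentences or tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Overlapping windows are a common choice because a sentence cut at one chunk boundary still appears whole in a neighbouring chunk.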

When to Use This Skill

Use knowledge base ingestion when:

  • The user wants to add information for later retrieval
  • The user is building a custom knowledge repository
  • The user is adding reference materials, documentation, or notes
  • The user shares content they want to “remember” or store

Ingestion Guidelines

  1. Use Descriptive Titles: Choose titles that will help identify the content later. Good titles make it easier to find documents.

  2. Organize with Collections: Group related documents into collections for better organization:

    • technical_docs – Technical documentation
    • meeting_notes – Meeting summaries
    • research – Research materials
    • policies – Company policies

  3. Chunk Size: Documents are automatically split into smaller chunks for better search relevance. You don’t need to worry about document size.

  4. Confirm Success: After ingestion, confirm the document was added successfully by reporting the document ID and chunk count.

Example Workflows

Adding Text Content

  1. User provides text to remember
  2. Call kb_ingest_text with the text and a descriptive title
  3. Optionally specify a collection
  4. Confirm success with document details

Adding Web Content

  1. User provides a URL
  2. Call kb_ingest_url with the URL
  3. Content is fetched, processed, and stored
  4. Confirm success with document details
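
Before step 2 it can help to check that the user's input is actually a web URL. The sketch below is one possible pre-check, not part of the skill itself: `is_ingestible_url` is a hypothetical helper, and restricting ingestion to `http` and `https` is an assumption about what `kb_ingest_url` accepts.

```python
from urllib.parse import urlparse


def is_ingestible_url(url: str) -> bool:
    """Return True for absolute http(s) URLs that include a host.

    Hypothetical pre-check; the real tool may accept or reject other inputs.
    """
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError on some malformed inputs (e.g. bad IPv6 hosts).
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("https://example.com/docs", "ftp://example.com", "not a url"):
        print(candidate, is_ingestible_url(candidate))
```

Rejecting malformed input early gives the user a clearer message than a failed fetch would.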

Removing Content

  1. User requests document removal
  2. Use kb_list_documents (from the KB Search skill) to find the document
  3. Call kb_delete_document with the document ID
  4. Confirm deletion
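
The removal steps above can be sketched as a lookup followed by a guarded delete. The shape of the `kb_list_documents` output is not documented here, so the list of dicts with `document_id` and `title` keys is an assumption, and `delete_fn` stands in for the real `kb_delete_document` call.

```python
def find_matches(documents, title_query):
    """Return documents whose title contains the query, ignoring case."""
    needle = title_query.casefold().strip()
    return [doc for doc in documents if needle in doc["title"].casefold()]


def delete_by_title(documents, title_query, delete_fn):
    """Delete only when exactly one document matches; otherwise report why not.

    `documents` mimics an assumed listing format; `delete_fn` is a placeholder
    for the real deletion tool and receives the document ID.
    """
    matches = find_matches(documents, title_query)
    if not matches:
        return f"No document matches '{title_query}'."
    if len(matches) > 1:
        titles = ", ".join(doc["title"] for doc in matches)
        return (
            f"Multiple documents match '{title_query}': {titles}. "
            "Ask the user which one to delete."
        )
    doc = matches[0]
    delete_fn(doc["document_id"])
    return f"Deleted '{doc['title']}' (document {doc['document_id']})."


if __name__ == "__main__":
    listing = [
        {"document_id": "d1", "title": "Q3 Planning Notes"},
        {"document_id": "d2", "title": "Security Policy"},
    ]
    print(delete_by_title(listing, "security", lambda doc_id: None))
```

Because deleted documents cannot be recovered, the sketch refuses to delete when the title is ambiguous and leaves the choice to the user.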

Important Notes

  • Ingested content is stored in a vector database for semantic search
  • Content is automatically chunked into smaller pieces for better retrieval
  • Embeddings are generated using an OpenAI text-embedding model
  • Deleted documents cannot be recovered
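
To illustrate why chunks are stored as embeddings, here is a toy sketch of semantic ranking by cosine similarity. The two-dimensional vectors are made up for the example (real embeddings have many more dimensions), and `rank_chunks` is a hypothetical helper; the skill's vector database performs this kind of comparison internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs):
    """Return (chunk_id, score) pairs sorted from most to least similar."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up embeddings for three stored chunks.
    chunks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    for chunk_id, score in rank_chunks([1.0, 0.1], chunks):
        print(chunk_id, round(score, 3))
```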