Vector Search Quick Start¶
A quick guide to setting up semantic vector search for your knowledge base.
What This Gives You¶
- 🔍 Semantic Search: Search by meaning, not keywords
- 🚀 Automatic Indexing: Knowledge base is automatically vectorized
- 🎯 Accurate Results: Finds relevant documents even without exact word matches
- 🌐 Multi-language: Russian and English support
Architecture¶
Quick Start¶
1. Prepare Files¶
# Copy configuration examples
cp .env.example .env
cp config.example.yaml config.yaml
# Create data directories
mkdir -p data/qdrant_storage data/infinity_cache
2. Configure .env¶
Edit `.env`; at minimum, set:
# Your Telegram bot token
TELEGRAM_BOT_TOKEN=your_bot_token_here
# Allowed user IDs
ALLOWED_USER_IDS=123456789
# Embedding model (for Russian + English)
INFINITY_MODEL=BAAI/bge-m3
3. Configure config.yaml¶
Enable vector search in config.yaml:
# Enable vector search
VECTOR_SEARCH_ENABLED: true
# Embedding settings
VECTOR_EMBEDDING_PROVIDER: infinity
VECTOR_EMBEDDING_MODEL: BAAI/bge-m3
VECTOR_INFINITY_API_URL: http://infinity:7997
# Qdrant settings
VECTOR_STORE_PROVIDER: qdrant
VECTOR_QDRANT_URL: http://qdrant:6333
VECTOR_QDRANT_COLLECTION: knowledge_base
4. Start Services¶
# Start all services (includes Qdrant and Infinity)
# IMPORTANT: vLLM and SGLang both use port 8001 - comment out one of them in docker-compose.yml!
docker-compose up -d
# Watch logs (wait for model loading ~1-2 minutes)
docker-compose logs -f infinity
5. Verify Operation¶
# Check status of all services
docker-compose ps
# Should be running:
# - tg-note-qdrant (healthy)
# - tg-note-infinity (healthy)
# - tg-note-hub (healthy)
# - tg-note-bot (running)
Usage¶
Through Telegram Bot¶
- Send documents to the bot (text, PDF, DOCX, etc.)
- The bot automatically indexes them
- Ask questions; the bot will answer using semantic search
Example¶
👤 User: How do transformers work in NLP?
🤖 Bot: [finds relevant sections in knowledge base using vector search]
Model Selection¶
For Russian + English (recommended)¶
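This heading has no snippet in the extracted page; assuming the same `.env` key shown in step 2, selecting the multilingual model would look like:

```
INFINITY_MODEL=BAAI/bge-m3
```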
English only (faster)¶
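Likewise, assuming the same `.env` key from step 2, an English-only setup with the smaller, faster model would look like:

```
INFINITY_MODEL=BAAI/bge-small-en-v1.5
```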
Other Options¶
| Model | Languages | Quality | Speed |
|---|---|---|---|
| `BAAI/bge-m3` | Multilingual | ⭐⭐⭐⭐⭐ | ⚡⚡ |
| `BAAI/bge-small-en-v1.5` | English | ⭐⭐⭐⭐ | ⚡⚡⚡ |
| `BAAI/bge-base-en-v1.5` | English | ⭐⭐⭐⭐⭐ | ⚡⚡ |
Management¶
View Logs¶
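The original snippet is missing here; a typical invocation, assuming the `infinity` and `qdrant` service names used elsewhere in this guide:

```shell
# Follow logs for the vector search services
docker-compose logs -f infinity qdrant
```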
Restart¶
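The original snippet is missing here; assuming the same service names, a restart would look like:

```shell
# Restart the vector search services
docker-compose restart infinity qdrant
```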
Stop¶
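The original snippet is missing here; a likely sketch, assuming the data directories created in step 1 hold the persisted index:

```shell
# Stop all services; indexed data remains in ./data
docker-compose down
```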
Resource Requirements¶
Minimum Requirements¶
- RAM: 4 GB (with the `bge-small` model)
- Disk: 5 GB (for models and data)
- CPU: 2 cores
Recommended Requirements¶
- RAM: 8 GB (with the `bge-m3` model)
- Disk: 10 GB
- CPU: 4 cores
- GPU (optional): Speeds up processing 5-10x
GPU Acceleration (Optional)¶
To speed up embedding generation, uncomment the GPU settings in the `infinity` section of `docker-compose.yml`:
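The commented-out block is not shown in this page; it typically follows the standard Compose GPU reservation (the exact contents depend on your `docker-compose.yml`):

```yaml
# docker-compose.yml, under the infinity service (sketch)
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
```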
Requirements:

- NVIDIA GPU
- nvidia-docker (NVIDIA Container Toolkit) installed
Troubleshooting¶
Infinity Won't Start¶
Problem: Infinity container keeps restarting
Solution: Check the logs and allow 1-2 minutes for the model to download and load
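A quick way to inspect the failure, assuming the `infinity` service name used elsewhere in this guide:

```shell
# Show the most recent output from the Infinity container
docker-compose logs --tail=100 infinity
```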
Out of Memory¶
Problem: Out of memory
Solution: Switch to a smaller embedding model
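For example, using the keys shown earlier in this guide, switch both `.env` and `config.yaml` to the smaller English model (both values must match):

```
# .env
INFINITY_MODEL=BAAI/bge-small-en-v1.5

# config.yaml
VECTOR_EMBEDDING_MODEL: BAAI/bge-small-en-v1.5
```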
Vector Search Not Available¶
Problem: "Vector search is not available"
Solution: Verify that vector search is enabled and correctly configured in `config.yaml`
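A quick sanity check on the settings from step 3 (assuming `config.yaml` is in the current directory):

```shell
# Confirm the vector search keys are present and enabled
grep -E "VECTOR_(SEARCH_ENABLED|QDRANT_URL|INFINITY_API_URL)" config.yaml
```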
How It Works¶
1. Bot receives a document from the user
2. Bot calls `reindex_vector` through MCP Hub
3. MCP Hub sends the text to Infinity to create embeddings
4. MCP Hub saves the vectors to Qdrant
5. When searching, Bot calls `vector_search`
6. MCP Hub gets the query embedding from Infinity
7. MCP Hub searches for similar vectors in Qdrant
8. Bot receives the relevant results
Each user's knowledge base → separate collection in Qdrant.
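The search half of this flow can be exercised manually for debugging. This is a sketch, assuming the Infinity and Qdrant ports (7997, 6333) are published to the host, the collection name from `config.yaml`, and `jq` installed; Infinity exposes an OpenAI-compatible `/embeddings` endpoint:

```shell
# 1. Get a query embedding from Infinity (what MCP Hub does internally)
curl -s http://localhost:7997/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "BAAI/bge-m3", "input": "How do transformers work?"}' \
  | jq '.data[0].embedding' > query_vector.json

# 2. Search for similar vectors in Qdrant
curl -s http://localhost:6333/collections/knowledge_base/points/search \
  -H "Content-Type: application/json" \
  -d "{\"vector\": $(cat query_vector.json), \"limit\": 5}"
```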
Additional Documentation¶
Complete documentation: Docker Vector Search Setup
AICODE-NOTE¶
- Bot manages vectorization logic (when to index, when to search)
- MCP Hub provides tools (vector_search, reindex_vector)
- Infinity generates embeddings (converts text to vectors)
- Qdrant stores vectors (separate collection for each knowledge base)