Storage & Search

Vector Databases

Specialized databases for storing and searching high-dimensional embeddings

Pinecone

Fully managed vector database delivered as a hosted service. Best for teams that want minimal infrastructure overhead.

Managed

Weaviate

Open-source vector database with native hybrid search. Supports both dense vectors and keyword search in one query.

Open Source

Milvus

Highly scalable open-source vector database. Excellent for large-scale deployments with billions of vectors.

Open Source

Qdrant

Vector database written in Rust, with payload filtering alongside vector search. Good balance of speed and features.

Open Source

Chroma

Lightweight, developer-friendly embedding database. Excellent for prototyping and smaller deployments.

Open Source

pgvector

PostgreSQL extension for vector similarity search. Add vector capabilities to existing Postgres deployments.

Extension
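
The core operation all of these systems provide is nearest-neighbor search over embeddings. As a rough illustration only, here is a brute-force cosine-similarity search in plain Python. The names `cosine_similarity`, `top_k`, and the toy `index` are hypothetical and do not come from any of the products above, which typically rely on approximate indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (assumed data, for illustration only).
index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], index, k=2)
print(results)
```

A full scan like this costs time proportional to the number of stored vectors on every query, which is the main reason a dedicated index becomes attractive as collections grow.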

Representations

Embedding Models

Models for converting text to dense vector representations

OpenAI Embeddings

text-embedding-3-small and text-embedding-3-large. Strong retrieval quality, with output dimensions that can be shortened per request to trade accuracy for storage cost.

API

Cohere Embed

Embed v3 models (embed-english-v3.0 and embed-multilingual-v3.0). The multilingual variant performs well across diverse languages and domains.

API

BGE (BAAI)

BAAI General Embedding models. Top open-source option with multiple sizes from small to large.

Open Source

E5

Microsoft's embedding models with strong results on retrieval benchmarks. Instruction-tuned variants accept a task description with each query.

Open Source

Voyage AI

Specialized embedding models for code, legal, and other specific domains. Domain expertise built-in.

API

Instructor

Task-specific embeddings with instruction prefixes. Customize embeddings for your specific use case.

Open Source
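
Variable dimension sizes, as mentioned for the text-embedding-3 models, come down to keeping a prefix of the vector and rescaling it to unit length. The sketch below shows that manual step in plain Python. `shorten_embedding` is a hypothetical helper, the input vector is made-up data, and the approach only makes sense for models trained so that leading dimensions carry the most information.

```python
import math


def shorten_embedding(vector, dims):
    """Keep the first `dims` components and re-normalize to unit L2 length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    truncated = vector[:dims]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        return truncated  # nothing to rescale
    return [x / norm for x in truncated]


# Made-up 3-dimensional embedding, shortened to 2 dimensions.
full = [3.0, 4.0, 12.0]
short = shorten_embedding(full, 2)
print(short)
```

Re-normalizing matters because cosine and dot-product scores are only comparable across documents when every stored vector has the same length.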

Development

RAG Frameworks

Tools for building retrieval-augmented generation systems

LangChain

Comprehensive framework for LLM applications. Rich ecosystem of integrations and abstractions.

Python/JS

LlamaIndex

Data framework for LLM applications. Strong focus on indexing strategies and query optimization.

Python/TypeScript

Haystack

End-to-end framework for building NLP pipelines. Strong RAG and search capabilities.

Python

DSPy

Programming framework for LLM applications with automatic prompt optimization.

Python

Semantic Kernel

Microsoft's SDK for integrating AI into applications. Strong enterprise integration.

C#/Python/Java

Vercel AI SDK

TypeScript toolkit for building AI-powered web apps. Great for React/Next.js applications.

TypeScript
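
Whatever the framework, the retrieval-augmented generation loop has the same shape: fetch relevant passages, then place them in the prompt sent to the model. The following is a minimal sketch of that shape in plain Python, not the API of any framework listed here. `retrieve` and `build_prompt` are hypothetical names, word overlap stands in for embedding similarity, and the final model call is left out.

```python
def tokenize(text):
    """Lowercase whitespace tokenization into a set of words."""
    return set(text.lower().split())


def retrieve(question, documents, k=2):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Toy corpus (assumed data, for illustration only).
documents = [
    "pgvector is a PostgreSQL extension for vector similarity search",
    "Qdrant is a vector database written in Rust",
    "DSPy is a framework with automatic prompt optimization",
]

question = "which vector database is written in rust"
passages = retrieve(question, documents, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

In a real pipeline the `retrieve` step would query a vector database using an embedding model, and `prompt` would be passed to an LLM provider.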

Models

LLM Providers

Major providers of large language model APIs and platforms

OpenAI

GPT-4, GPT-4 Turbo, and GPT-3.5. Industry leader with extensive capabilities.

Anthropic

Claude 3 family (Opus, Sonnet, Haiku). Known for its safety focus and a 200K-token context window.

Google

Gemini Pro and Gemini Ultra. Available through Google AI Studio and through Vertex AI on Google Cloud.

Meta

LLaMA models. Open weights enabling self-hosting and fine-tuning.