Vector Databases

What it is

A vector database stores each piece of text, image, or other data as a long list of numbers that captures its meaning. These lists are high-dimensional vectors called embeddings, and an embedding model produces them. When you search, the database finds the items whose embeddings are closest to the embedding of your query. Think of a librarian who hands you books on the same topic even if they use completely different words. Instead of matching exact keywords, the database ranks results by semantic closeness, measured with a distance metric such as cosine similarity. Vector databases power semantic search, recommendations, and retrieval-augmented generation (RAG) for AI apps.
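To make "closest in meaning" concrete, here is a minimal sketch of ranking by cosine similarity. The three-dimensional vectors and the item names are made up for illustration; real embedding models emit hundreds or thousands of dimensions, and a real database would not compare the query against every item one by one.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example.
items = {
    "how to reset a password": [0.9, 0.1, 0.0],
    "recovering account access": [0.8, 0.3, 0.1],
    "chocolate cake recipe": [0.0, 0.2, 0.9],
}
query = [0.85, 0.2, 0.05]  # made-up embedding of "I forgot my login"

# Highest similarity first: the two account-related items outrank the recipe.
ranked = sorted(items, key=lambda name: cosine_similarity(query, items[name]), reverse=True)
print(ranked)
```

Note that neither account-related item shares a word with "I forgot my login"; the ranking comes only from the direction of the vectors.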

Strengths

Fast similarity search over millions of embeddings via ANN (Approximate Nearest Neighbor) indexes.
Enables semantic search and RAG that match on meaning, not just keywords.
Many options: pgvector, Pinecone, Qdrant, Weaviate, Milvus, Chroma.
Metadata filtering combines semantic search with structured constraints.
Scales to large corpora, because ANN indexes avoid comparing the query against every stored vector.
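The metadata filtering point can be sketched as follows. This is an exact, brute-force search written for illustration only; the Record type, the field names, and the sample data are assumptions, and a real vector database would answer the same question through an ANN index.

```python
import math
from dataclasses import dataclass

@dataclass
class Record:
    id: int
    doc_type: str      # structured metadata stored next to the vector
    embedding: list

def cosine_distance(a, b):
    """1 - cosine similarity, so smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def filtered_search(records, query, doc_type, k):
    """Exact top-k among the records that pass the metadata filter."""
    candidates = [r for r in records if r.doc_type == doc_type]
    candidates.sort(key=lambda r: cosine_distance(r.embedding, query))
    return [r.id for r in candidates[:k]]

# Hypothetical two-dimensional embeddings.
records = [
    Record(1, "manual", [1.0, 0.0]),
    Record(2, "faq", [0.99, 0.1]),     # very close to the query, but filtered out
    Record(3, "manual", [0.0, 1.0]),
    Record(4, "manual", [0.7, 0.7]),
]
top = filtered_search(records, query=[1.0, 0.1], doc_type="manual", k=2)
print(top)  # [1, 4]
```

Record 2 is the second-closest vector overall, yet it never appears in the results because the structured constraint is applied alongside the similarity ranking.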

Trade-offs

Results are approximate; tuning trades recall against speed.
Result quality depends heavily on the embedding model that produced the vectors.
Changing embedding models requires re-embedding the whole corpus, because vectors from different models are not comparable.
Storage and memory grow with dimension count and dataset size.
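Two of these trade-offs are easy to put numbers on. The sketch below measures recall as the share of the true nearest neighbors that an approximate search returned, and estimates raw vector storage assuming 4-byte float32 values (index overhead is not included). The function names and the sample ids are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k neighbors that the approximate search returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)

def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Storage for the vectors alone; assumes float32 unless told otherwise."""
    return num_vectors * dimensions * bytes_per_value

# The approximate search missed one of the five true neighbors (id 5).
recall = recall_at_k(approx_ids=[7, 2, 9, 4, 11], exact_ids=[7, 2, 9, 4, 5])
print(recall)  # 0.8

# One million 1536-dimensional vectors.
size_gb = raw_vector_bytes(1_000_000, 1536) / 1e9
print(size_gb)  # 6.144
```

Comparing approximate results against an exact search on a small sample is one way to check whether an index setting gives up too much recall for its speed.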

When to use it

Use a vector database for semantic search, document Q&A, recommendations, deduplication, and RAG pipelines that feed relevant context into an LLM (Large Language Model). For modest scale, the pgvector extension lets you keep vectors alongside relational data in PostgreSQL.

Vibe coding fit

When directing an AI coding assistant, be explicit about three things: which embedding model you use (so vector dimensions match the index), the distance metric (cosine is typical for text), and how you chunk source documents before embedding. Ask the AI to store useful metadata next to each vector so you can filter (for example by user_id or doc_type), and to make sure the indexing code and the query code use the same embedding model. A good tip: have the AI build a RAG retrieval step that returns the top-k chunks plus their source ids. That way answers stay traceable, and you can verify what context the model actually used.

-- PostgreSQL with the pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id        BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  content   TEXT NOT NULL,
  doc_type  TEXT,
  embedding VECTOR(1536)            -- must match your embedding model
);

-- Approximate-nearest-neighbor index for cosine distance
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops);

-- Top-5 most similar 'manual' chunks to a query embedding ($1).
-- <=> is pgvector's cosine distance operator, so smaller values mean closer matches.
SELECT id, content
FROM documents
WHERE doc_type = 'manual'
ORDER BY embedding <=> $1
LIMIT 5;
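
The chunking and traceable retrieval step described under "Vibe coding fit" can also be sketched outside the database. Everything here is illustrative: toy_embed is a bag-of-words stand-in for a real embedding model, the fixed-size word chunking is only one possible strategy, and the document ids are invented.

```python
import math
from collections import Counter

def chunk_text(text, max_words=8):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text):
    """Stand-in for a real embedding model: a sparse word-count vector."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(docs):
    """docs maps source_id -> text. Each chunk keeps its source id and position."""
    index = []
    for source_id, text in docs.items():
        for chunk_no, chunk in enumerate(chunk_text(text)):
            index.append({"source_id": source_id, "chunk_no": chunk_no,
                          "text": chunk, "embedding": toy_embed(chunk)})
    return index

def retrieve(index, question, k=2):
    """Return the top-k chunks as (source_id, chunk_no, text) tuples."""
    q = toy_embed(question)  # same "model" as at indexing time
    ranked = sorted(index, key=lambda e: cosine_similarity(q, e["embedding"]), reverse=True)
    return [(e["source_id"], e["chunk_no"], e["text"]) for e in ranked[:k]]

docs = {
    "manual-42": "To reset the router hold the reset button for ten seconds until the light blinks",
    "faq-7": "Shipping takes three to five business days within the country",
}
hits = retrieve(build_index(docs), "how do I reset the router", k=1)
print(hits)
```

Because each hit carries its source id and chunk number, the context passed to the LLM can be traced back to the document it came from.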