pgvector on Supabase vs a Purpose-Built Agent Database

TL;DR

pgvector on Supabase is a vector search extension running on CPU inside a web-app-oriented Postgres platform. It works for simple RAG with small datasets. For production agent workloads - fleet-scale concurrency, GPU-accelerated search, per-agent isolation, scale-to-zero - you need a purpose-built agent database. Deeplake is that database.

Overview

pgvector is the most popular way to add vector search to Postgres. Supabase makes it easy to enable with a single command. For a proof of concept or a web app with light AI features, this combination is convenient and functional.

But "convenient and functional" breaks down quickly when agents become the primary workload. pgvector is CPU-bound, Supabase is always-on, there's no per-agent isolation, and the entire architecture was designed for web applications, not autonomous AI systems that spin up, execute, and tear down at machine speed.

pgvector on Supabase: What You Get

sql

-- Enable pgvector on Supabase
CREATE EXTENSION vector;
 
-- Create a table with embeddings
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(1536)
);
 
-- Create an index
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
 
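-- Query (<=> is cosine distance, matching the vector_cosine_ops index above;
-- <-> is L2 distance and would not use this index)
SELECT content, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;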

This works. For a single user, a small dataset, and infrequent queries.
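pgvector exposes more than one distance operator: `<->` is Euclidean (L2) distance and `<=>` is cosine distance, and the `vector_cosine_ops` operator class used in the index above serves `<=>` queries. As a rough illustration of what the two operators compute, here is a minimal pure-Python sketch. The function names and sample vectors are illustrative assumptions, not part of pgvector.

```python
import math

def l2_distance(a, b):
    """Euclidean distance: what pgvector's <-> operator computes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity: what pgvector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

query = [1.0, 0.0]
same_direction = [3.0, 0.0]  # same direction as the query, different length
orthogonal = [0.0, 1.0]

print(l2_distance(query, same_direction))      # 2.0
print(cosine_distance(query, same_direction))  # 0.0
print(cosine_distance(query, orthogonal))      # 1.0
```

The two metrics can rank the same rows differently, so the operator in the query should match the operator class the index was built with.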

Where pgvector on Supabase Breaks

1. CPU-Bound Vector Search

pgvector runs on CPU. Vector similarity search is a massively parallel operation: a query vector is compared against up to millions of stored vectors, and each comparison is independent of the others. A CPU works through these comparisons on a handful of cores with limited SIMD parallelism. A GPU runs thousands of them at once.

Dataset Size	pgvector (CPU)	Deeplake (GPU)
100K vectors	~10ms	~2ms
1M vectors	~50ms	~5ms
10M vectors	~200ms+	~20ms
1M vectors, 50 concurrent	~300ms+	~8ms
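To make the shape of the work concrete, the following is a minimal sketch of exact (brute-force) nearest-neighbour search in pure Python. It is an illustration under stated assumptions, not how pgvector or Deeplake are implemented: an IVFFlat index, for example, scans only a subset of the stored vectors and returns approximate results. The function name and the random data are hypothetical.

```python
import heapq
import random

def brute_force_top_k(query, vectors, k):
    """Score every stored vector against the query and return the indices of the k closest."""
    def squared_l2(v):
        return sum((q - x) ** 2 for q, x in zip(query, v))
    # One distance computation per stored vector. None of them depends on
    # another, which is why this workload parallelises so well.
    return heapq.nsmallest(k, range(len(vectors)), key=lambda i: squared_l2(vectors[i]))

random.seed(0)
dim, n = 8, 1000
vectors = [[random.random() for _ in range(dim)] for _ in range(n)]
query = vectors[42]  # a stored vector is its own nearest neighbour

top = brute_force_top_k(query, vectors, k=3)
print(top[0])  # 42
```

The cost grows with the number of vectors times the dimension, and every concurrent query repeats it, which is the pressure the concurrent row in the table describes.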

2. No Agent Isolation

Supabase uses Row-Level Security (RLS) for tenant isolation. This is great for web apps where each user sees their own rows. It's insufficient for agents that need sandbox-level isolation - where one agent's operations cannot possibly interfere with another's.

python

# pgvector on Supabase: shared table, RLS for "isolation"
# Agent A and Agent B query the same table
# If Agent A holds a long write transaction, Agent B's writes to the same rows wait on its locks
 
# Deeplake: true branch isolation
import deeplake
 
agent_a = deeplake.connect("production", branch="agent-a-session")
agent_b = deeplake.connect("production", branch="agent-b-session")
# Completely independent: no shared locks, no interference

3. No Scale-to-Zero

Supabase projects run continuously. If your agents are active for 2 hours out of 24, you're paying for 22 hours of idle time. Deeplake scales to zero and provisions back in ~200ms.
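The idle-time arithmetic above can be sketched in a few lines of Python. The helper name is an assumption for illustration; it only computes the share of a billing period with no agent activity and makes no claim about either vendor's actual pricing.

```python
def idle_fraction(active_hours, period_hours=24):
    """Fraction of the period during which an always-on instance sits idle."""
    if not 0 <= active_hours <= period_hours:
        raise ValueError("active_hours must be between 0 and period_hours")
    return (period_hours - active_hours) / period_hours

# 2 active hours out of 24 leaves 22 idle hours.
print(f"{idle_fraction(2):.1%}")  # 91.7%
```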

4. Connection Limits Under Agent Load

Supabase's Postgres instance has connection limits. Agent workloads create and destroy connections rapidly. At fleet scale:

50 agents with 2 connections each = 100 connections
Supabase free tier: 60 connections max
Supabase pro tier: Still limited

Deeplake's branch-per-agent model doesn't exhaust a shared connection pool.
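The connection budget above is simple arithmetic, sketched here in Python. The function names and the `reserved` parameter are illustrative assumptions; the limit of 60 is the free-tier figure quoted in this section.

```python
def connection_demand(agents, connections_per_agent):
    """Total connections the fleet tries to hold open at once."""
    return agents * connections_per_agent

def max_agents(connection_limit, connections_per_agent, reserved=0):
    """Largest fleet that fits under the limit after reserving some connections."""
    return max(0, (connection_limit - reserved) // connections_per_agent)

limit = 60                          # free-tier limit quoted above
demand = connection_demand(50, 2)   # 50 agents, 2 connections each

print(demand, demand > limit)  # 100 True
print(max_agents(limit, 2))    # 30
```

Under these numbers the fleet would have to shrink to 30 agents, or fewer if some connections are reserved for migrations and monitoring.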

5. Index Maintenance Overhead

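pgvector's IVFFlat index computes its cluster centroids when the index is built, so recall degrades as the data grows or its distribution shifts, and the index needs periodic rebuilding. In production, this means: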

sql

-- Rebuild index. A plain REINDEX blocks writes to the table while it runs;
-- REINDEX INDEX CONCURRENTLY (Postgres 12+) avoids that but takes longer.
REINDEX INDEX documents_embedding_idx;

For agents that continuously write embeddings, index maintenance becomes a recurring operational burden.
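One way to see why an IVFFlat index goes stale is the sizing guidance in the pgvector README: as a starting point it suggests `lists = rows / 1000` for up to 1M rows and `sqrt(rows)` beyond that, with `probes` around `sqrt(lists)` at query time. The sketch below encodes that rule of thumb. The function names are assumptions, and the values are starting points to tune, not guarantees.

```python
import math

def suggested_lists(row_count):
    """pgvector README starting point: rows/1000 up to 1M rows, sqrt(rows) above that."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))

def suggested_probes(lists):
    """pgvector README starting point for ivfflat.probes: sqrt(lists)."""
    return max(1, int(math.sqrt(lists)))

for rows in (100_000, 1_000_000, 10_000_000):
    lists = suggested_lists(rows)
    print(rows, lists, suggested_probes(lists))
```

The `lists = 100` setting in the earlier `CREATE INDEX` statement fits roughly 100K rows. By the time the table holds 1M rows the suggested value is 1000, so an index that was reasonable at creation needs to be rebuilt with new parameters.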

Purpose-Built: What Deeplake Provides

python

import deeplake
 
# Serverless, GPU-native, branch-per-agent
db = deeplake.connect("agent-platform", branch="research-agent-4417")
 
# Write embeddings (no index rebuild needed)
db.execute("""
    INSERT INTO knowledge (source, content, embedding, metadata)
    VALUES (%s, %s, %s, %s)
""", [source_url, content, embedding_vector, metadata_json])
 
# GPU-accelerated vector search with SQL filters
results = db.execute("""
    SELECT source, content, embedding <-> %s AS relevance
    FROM knowledge
    WHERE metadata->>'domain' = 'engineering'
      AND created_at > NOW() - INTERVAL '7 days'
    ORDER BY embedding <-> %s
    LIMIT 15
""", [query_embedding, query_embedding])
 
# Structured queries alongside vector search
stats = db.execute("""
    SELECT COUNT(*), AVG(token_count)
    FROM knowledge
    WHERE source LIKE %s
""", ["%arxiv%"])

Side-by-Side Comparison

Aspect	pgvector on Supabase	Deeplake
Vector search engine	Extension (CPU)	Core (GPU)
Query language	SQL (Postgres)	SQL (Postgres-compatible)
Provisioning	Minutes (new project)	~200ms (new branch)
Agent isolation	RLS (row-level)	Branch (sandbox-level)
Scale to zero	No	Yes
Concurrent agents	Connection-limited	Branch-isolated
Multimodal	BLOBs in Supabase Storage	Native, queryable
Index maintenance	Manual rebuild needed	Automatic
GPU acceleration	No	Native
Cost when idle	Full project cost	Zero

The Migration Path

Because Deeplake is Postgres-compatible, migrating from pgvector on Supabase is straightforward:

SQL queries - Work as-is (Deeplake understands pgvector syntax)
ORMs - SQLAlchemy, Prisma, etc. connect the same way
Embeddings - Same vector dimensions, same distance metrics
Application code - Change the connection string, keep the queries
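If the application reads its connection string from configuration, the last step in the list above is a configuration change rather than a code change. The following is a minimal standard-library sketch of that pattern. The `DATABASE_URL` variable name, the `load_db_settings` helper, and the `.example` hostnames are all hypothetical placeholders, not real Supabase or Deeplake endpoints.

```python
import os
from urllib.parse import urlsplit

# Placeholder default for local development (hypothetical credentials).
DEFAULT_URL = "postgresql://app:secret@localhost:5432/postgres"

def load_db_settings(env=None):
    """Parse a Postgres-style URL from the environment into connection settings."""
    env = os.environ if env is None else env
    parts = urlsplit(env.get("DATABASE_URL", DEFAULT_URL))
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,
        "user": parts.username,
        "dbname": parts.path.lstrip("/"),
    }

# The query text stays the same on both sides of the migration.
QUERY = "SELECT content FROM documents ORDER BY embedding <=> %s::vector LIMIT 10"

before = load_db_settings({"DATABASE_URL": "postgresql://app:secret@old-host.example:5432/postgres"})
after = load_db_settings({"DATABASE_URL": "postgresql://app:secret@new-host.example:5432/postgres"})
print(before["host"], "->", after["host"])
```

The resulting settings can be passed to whichever Postgres driver or ORM the application already uses.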

When pgvector on Supabase Is Enough

Prototype or MVP with < 100K embeddings
Web app with light AI features (not agent-primary)
Single-agent, low-concurrency use case
You need Supabase's auth, storage, and real-time features

When You Need Deeplake

Production agent systems at any scale
Fleet deployment with concurrent agents
Vector datasets > 1M embeddings
Bursty workloads requiring scale-to-zero
Per-agent sandbox isolation
GPU-accelerated performance requirements

