pgvector Migration: Legacy SQL to Vector-Native Postgres
Bottom Line
Do not wait for a hypothetical pgvector 0.9 release to modernize your stack. As of May 04, 2026, the latest official tag is 0.8.2, and it is already enough to add production-grade vector search to PostgreSQL with a low-risk migration path.
Key Takeaways
- As of May 04, 2026, the latest official pgvector tag is 0.8.2, not 0.9.
- Start with HNSW for the best speed/recall tradeoff; use IVFFlat when build time and memory matter more.
- Backfill embeddings first, then create the ANN index, then switch reads after EXPLAIN proves index usage.
- Filtered vector queries often need higher hnsw.ef_search or iterative scans to keep recall high.
If your app still relies on LIKE, ILIKE, or manual ranking tables, you can move to vector search without abandoning PostgreSQL. One important correction first: as of May 04, 2026, there is no official pgvector 0.9 tag yet. The latest official tag is v0.8.2, and that is what this tutorial uses. The migration pattern is still the right one for future releases: add vectors beside relational data, backfill safely, index deliberately, and cut traffic over only after you can prove recall and latency.
Prerequisites
Bottom Line
Use pgvector 0.8.2 today, keep your existing PostgreSQL tables, and treat vector search as a schema migration plus a query-planner validation exercise.
Before you start
- PostgreSQL 13+, which pgvector currently supports in the official README.
- A staging or production-safe rollout path for CREATE INDEX CONCURRENTLY.
- An embedding pipeline in your application or ETL job. This guide focuses on the database side.
- A representative query set so you can compare legacy relevance, latency, and recall.
- A rollback plan that keeps old reads available until vector search is proven.
- If legacy text contains sensitive data, mask it before embedding generation with the Data Masking Tool.
What changes in practice
- Your relational schema stays in PostgreSQL.
- You add a vector column to the table that powers search.
- You generate embeddings outside Postgres and write them back into the row.
- You replace text-only ranking with nearest-neighbor queries plus your existing filters.
Plan the migration
Choose the right distance and index
- Use cosine distance when your embedding model is evaluated that way and your app thinks in similarity scores.
- Use L2 or inner product only if your model documentation or current ranking logic requires them.
- Start with HNSW when query latency and recall matter most.
- Choose IVFFlat when index build time and memory pressure are the bigger constraint.
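The three distance choices map to pgvector's query operators: `<=>` for cosine distance, `<->` for L2, and `<#>` for negative inner product. A minimal Python sketch of what each operator computes, useful for sanity-checking scores outside the database:

```python
import math

def cosine_distance(a, b):
    # pgvector's <=> operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

def l2_distance(a, b):
    # pgvector's <-> operator: Euclidean distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neg_inner_product(a, b):
    # pgvector's <#> operator: inner product, negated so that
    # smaller always means "more similar", like the other operators
    return -sum(x * y for x, y in zip(a, b))

a = [0.12, 0.44, 0.81]
b = [0.15, 0.39, 0.77]
# Nearly parallel vectors, so the cosine distance is close to 0
print(round(cosine_distance(a, b), 4))
```

Whichever metric you pick, the operator class in your index (for example `vector_cosine_ops`) must match the operator in your queries, or the planner will not use the index.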
Keep legacy search online during the rollout
- Add the vector column without changing reads.
- Backfill embeddings in batches.
- Create the ANN index after most rows are populated.
- Run dual reads in staging or a shadow path.
- Switch production reads only after EXPLAIN ANALYZE and relevance checks look correct.
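The dual-read step needs a concrete agreement metric. A small sketch, assuming you capture the top-k result ids from both the legacy path and the vector path in your shadow traffic; note this measures overlap with legacy results, not absolute relevance:

```python
def recall_at_k(legacy_ids, vector_ids, k=10):
    """Fraction of the legacy top-k that the vector path also returned."""
    legacy_top = set(legacy_ids[:k])
    vector_top = set(vector_ids[:k])
    if not legacy_top:
        return 1.0  # nothing to miss
    return len(legacy_top & vector_top) / len(legacy_top)

# Hypothetical id lists from one shadow-read comparison
legacy = [7, 3, 9, 1, 5]
vector = [3, 7, 2, 9, 8]
print(recall_at_k(legacy, vector, k=5))  # 0.6: ids 3, 7, 9 overlap
```

Track this per query over your representative query set; a sudden drop after an index or ef_search change is the regression signal you want before cutting production reads over.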
Migrate schema and backfill
Step 1: install or upgrade the extension
Use the latest official tagged release available on May 04, 2026:
cd /tmp
git clone --branch v0.8.2 https://github.com/pgvector/pgvector.git
cd pgvector
make
make install
Then enable it inside the target database:
CREATE EXTENSION IF NOT EXISTS vector;
SELECT extversion FROM pg_extension WHERE extname = 'vector';
Step 2: add vector columns to the legacy table
Use a dimension that matches your embedding model. The example below uses vector(3) to keep the SQL readable.
BEGIN;
ALTER TABLE articles
ADD COLUMN embedding vector(3),
ADD COLUMN embedding_updated_at timestamptz;
COMMIT;
Step 3: load embeddings into a staging table
Backfilling through a staging table is easier to retry and audit than updating rows one by one from application code.
CREATE TABLE article_embeddings_stage (
id bigint PRIMARY KEY,
embedding vector(3) NOT NULL
);
INSERT INTO article_embeddings_stage (id, embedding)
VALUES
(1, '[0.12,0.44,0.81]'),
(2, '[0.15,0.39,0.77]'),
(3, '[0.91,0.08,0.11]');
Step 4: backfill the main table in SQL
UPDATE articles a
SET embedding = s.embedding,
embedding_updated_at = now()
FROM article_embeddings_stage s
WHERE a.id = s.id;
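If articles is large, a single UPDATE can hold locks for a long time and generate heavy WAL; batching by id range is a common alternative. A sketch of the range logic only, with the per-batch `UPDATE ... WHERE a.id BETWEEN lo AND hi` left to your driver; the batch size is illustrative:

```python
def id_batches(min_id, max_id, batch_size):
    """Yield inclusive (lo, hi) id ranges covering [min_id, max_id]."""
    lo = min_id
    while lo <= max_id:
        hi = min(lo + batch_size - 1, max_id)
        yield lo, hi
        lo = hi + 1

# Each range drives one batched UPDATE against articles,
# e.g. UPDATE ... WHERE a.id BETWEEN 1 AND 4
for lo, hi in id_batches(1, 10, 4):
    print(lo, hi)
```

Committing after each batch keeps individual transactions short and makes a failed backfill resumable from the last completed range.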
SELECT COUNT(*) AS total_rows,
COUNT(*) FILTER (WHERE embedding IS NOT NULL) AS embedded_rows,
COUNT(*) FILTER (WHERE embedding IS NULL) AS missing_rows
FROM articles;
Add indexes and cut over
Step 5: create the first ANN index
For most semantic search migrations, start with HNSW and vector_cosine_ops.
CREATE INDEX CONCURRENTLY articles_embedding_hnsw_idx
ON articles
USING hnsw (embedding vector_cosine_ops);
If you are indexing a very large table, create the index after the bulk of your backfill is complete. The pgvector README also notes that HNSW builds faster when the graph fits in maintenance_work_mem.
SET maintenance_work_mem = '8GB';
SET max_parallel_maintenance_workers = 7;
Step 6: switch reads to nearest-neighbor ranking
SELECT id,
title,
1 - (embedding <=> '[0.14,0.41,0.79]') AS cosine_similarity
FROM articles
WHERE status = 'published'
AND embedding IS NOT NULL
ORDER BY embedding <=> '[0.14,0.41,0.79]'
LIMIT 10;
If filtered queries return too few matches, increase the HNSW search candidate list for that request:
BEGIN;
SET LOCAL hnsw.ef_search = 100;
SELECT id, title
FROM articles
WHERE status = 'published'
ORDER BY embedding <=> '[0.14,0.41,0.79]'
LIMIT 10;
COMMIT;
Step 7: keep IVFFlat as the fallback option
Use IVFFlat when build speed or memory footprint is more important than the best possible speed-recall tradeoff.
CREATE INDEX CONCURRENTLY articles_embedding_ivfflat_idx
ON articles
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
SET ivfflat.probes = 10;
- Choose HNSW when search quality under low latency is the priority.
- Choose IVFFlat when you need a lighter index build path and can tune probes later.
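The lists = 100 and probes = 10 values above follow the usual starting heuristic from the pgvector README; a sketch of that heuristic, with the caveat that these are tuning starting points for your own benchmarks, not requirements:

```python
import math

def suggest_ivfflat_params(row_count):
    """Starting-point heuristic from the pgvector README:
    lists ~ rows/1000 up to 1M rows, sqrt(rows) above that;
    probes commonly starts near sqrt(lists)."""
    if row_count <= 1_000_000:
        lists = max(row_count // 1000, 1)
    else:
        lists = int(math.sqrt(row_count))
    probes = max(int(math.sqrt(lists)), 1)
    return lists, probes

print(suggest_ivfflat_params(100_000))  # (100, 10)
```

Raising probes improves recall at the cost of latency; setting probes equal to lists degenerates into an exact scan, which removes the point of the index.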
Verification and expected output
Verify the extension version
SELECT extversion
FROM pg_extension
WHERE extname = 'vector';
Expected output:
extversion
------------
0.8.2
Verify the backfill completed
SELECT COUNT(*) FILTER (WHERE embedding IS NULL) AS missing_rows
FROM articles;
Expected output:
missing_rows
-------------
0
Verify that PostgreSQL uses the vector index
EXPLAIN ANALYZE
SELECT id, title
FROM articles
WHERE status = 'published'
ORDER BY embedding <=> '[0.14,0.41,0.79]'
LIMIT 10;
What you want to see:
- An Index Scan or similar plan node referencing articles_embedding_hnsw_idx or articles_embedding_ivfflat_idx.
- Lower latency than your exact scan baseline.
- Result quality that is acceptable against your shadow or offline relevance set.
Troubleshooting: top 3 issues
- Dimension mismatch during backfill. If your column is vector(768), every embedding must have exactly 768 values. Fix this in the embedding pipeline first; do not patch around it in SQL.
- Filtered queries return too few results. Approximate indexes apply filters after scanning candidates. Raise hnsw.ef_search, and if you are on 0.8.0+, consider SET hnsw.iterative_scan = strict_order; for filtered workloads that need more recall.
- Index builds are too slow. Build after bulk loading, increase maintenance_work_mem carefully, and raise max_parallel_maintenance_workers when the server has room. If HNSW is still too expensive, test IVFFlat instead.
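The dimension-mismatch issue is cheapest to catch in the pipeline, before any SQL runs. A minimal guard, assuming embeddings arrive as Python lists; to_pgvector_literal is a hypothetical helper name, and the expected dimension here matches the vector(3) example rather than a real model:

```python
EXPECTED_DIM = 3  # must match the column type, e.g. vector(3); real models use 768, 1536, ...

def to_pgvector_literal(embedding, expected_dim=EXPECTED_DIM):
    """Validate the dimension, then format a list as a pgvector text literal."""
    if len(embedding) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(embedding)}"
        )
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"

print(to_pgvector_literal([0.12, 0.44, 0.81]))  # [0.12,0.44,0.81]
```

Rejecting bad rows here keeps the staging-table INSERTs clean, so the SQL backfill never sees a vector that the column type would refuse.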
What's next
- Add relevance evaluation to CI or your release checklist so vector search does not quietly regress.
- Test partial indexes or partitioning if your workload is heavily filtered by tenant or category.
- Consider halfvec if index size becomes the next bottleneck.
- Track future pgvector tags directly before upgrading; do not plan around untagged releases.
Primary sources for this guide: pgvector README and the official tags page.
Frequently Asked Questions
Can pgvector replace my legacy SQL search stack without moving to a separate vector database?
Yes. The pattern in this guide keeps your relational schema in PostgreSQL, adds a vector column beside the existing data, and replaces text-only ranking with nearest-neighbor queries plus your existing filters.
Should I start with HNSW or IVFFlat in pgvector?
Start with HNSW when query latency and recall matter most. Choose IVFFlat when index build time or memory footprint is the bigger constraint, then tune lists and probes.
Why is PostgreSQL ignoring my pgvector index?
The planner uses an approximate index only when the query has an ORDER BY on the vector distance operator plus a LIMIT. If you remove the ordering, use the wrong operator class, or set IVFFlat probes equal to the number of lists, PostgreSQL may choose a different plan.
Can I upgrade pgvector without regenerating embeddings?
Yes. Install the new version and run ALTER EXTENSION vector UPDATE;. You only need to regenerate embeddings if you also change models, dimensions, or your application-side embedding strategy.