pgvector Migration: Legacy SQL to Vector-Native Postgres

Dillip Chowdary
Tech Entrepreneur & Innovator · May 04, 2026 · 9 min read

Bottom Line

Do not wait for a hypothetical pgvector 0.9 release to modernize your stack. As of May 04, 2026, the latest official tag is 0.8.2, and it is already enough to add production-grade vector search to PostgreSQL with a low-risk migration path.

Key Takeaways

  • As of May 04, 2026, the latest official pgvector tag is 0.8.2, not 0.9.
  • Start with HNSW for the best speed/recall tradeoff; use IVFFlat when build time and memory matter more.
  • Backfill embeddings first, then create the ANN index, then switch reads after EXPLAIN proves index usage.
  • Filtered vector queries often need higher hnsw.ef_search or iterative scans to keep recall high.

If your app still relies on LIKE, ILIKE, or manual ranking tables, you can move to vector search without abandoning PostgreSQL. One important correction first: as of May 04, 2026, there is no official pgvector 0.9 tag yet. The latest official tag is v0.8.2, and that is what this tutorial uses. The migration pattern is still the right one for future releases: add vectors beside relational data, backfill safely, index deliberately, and cut traffic over only after you can prove recall and latency.

Prerequisites

Bottom Line

Use pgvector 0.8.2 today, keep your existing PostgreSQL tables, and treat vector search as a schema migration plus a query-planner validation exercise.

Before you start

  • PostgreSQL 13+, the minimum version the official pgvector README lists as supported.
  • A staging or production-safe rollout path for CREATE INDEX CONCURRENTLY.
  • An embedding pipeline in your application or ETL job. This guide focuses on the database side.
  • A representative query set so you can compare legacy relevance, latency, and recall.
  • A rollback plan that keeps old reads available until vector search is proven.
  • If legacy text contains sensitive data, mask it with a data masking tool before embedding generation.

What changes in practice

  • Your relational schema stays in PostgreSQL.
  • You add a vector column to the table that powers search.
  • You generate embeddings outside Postgres and write them back into the row.
  • You replace text-only ranking with nearest-neighbor queries plus your existing filters.

Plan the migration

Choose the right distance and index

  • Use cosine distance when your embedding model is evaluated that way and your app thinks in similarity scores.
  • Use L2 or inner product only if your model documentation or current ranking logic requires them.
  • Start with HNSW when query latency and recall matter most.
  • Choose IVFFlat when index build time and memory pressure are the bigger constraint.

Keep legacy search online during the rollout

  1. Add the vector column without changing reads.
  2. Backfill embeddings in batches.
  3. Create the ANN index after most rows are populated.
  4. Run dual reads in staging or a shadow path.
  5. Switch production reads only after EXPLAIN ANALYZE and relevance checks look correct.
Pro tip: If your current search query includes strict business filters like tenant, status, or locale, keep those relational filters in SQL. pgvector works best as a ranking layer, not as a replacement for normal predicates.
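A minimal sketch of the dual-read check in step 4, assuming a hypothetical legacy ILIKE ranking and a published_at column that may not exist in your schema. Comparing the two top-10 lists gives a rough overlap signal before cutover:

-- Hypothetical shadow-read check: overlap between legacy and vector top 10.
WITH legacy AS (
  SELECT id FROM articles
  WHERE status = 'published' AND title ILIKE '%postgres%'
  ORDER BY published_at DESC
  LIMIT 10
),
vector_results AS (
  SELECT id FROM articles
  WHERE status = 'published'
  ORDER BY embedding <=> '[0.14,0.41,0.79]'
  LIMIT 10
)
SELECT COUNT(*) AS overlap_at_10
FROM legacy JOIN vector_results USING (id);

Low overlap is not automatically a failure, since vector ranking should surface different results, but it tells you which queries to inspect by hand before switching reads.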

Migrate schema and backfill

Step 1: install or upgrade the extension

Use the latest official tagged release available on May 04, 2026:

cd /tmp
git clone --branch v0.8.2 https://github.com/pgvector/pgvector.git
cd pgvector
make
make install

Then enable it inside the target database:

CREATE EXTENSION IF NOT EXISTS vector;
SELECT extversion FROM pg_extension WHERE extname = 'vector';

Step 2: add vector columns to the legacy table

Use a dimension that matches your embedding model. The example below uses vector(3) to keep the SQL readable.

BEGIN;

ALTER TABLE articles
  ADD COLUMN embedding vector(3),
  ADD COLUMN embedding_updated_at timestamptz;

COMMIT;

Step 3: load embeddings into a staging table

Backfilling through a staging table is easier to retry and audit than updating rows one by one from application code.

CREATE TABLE article_embeddings_stage (
  id bigint PRIMARY KEY,
  embedding vector(3) NOT NULL
);

INSERT INTO article_embeddings_stage (id, embedding)
VALUES
  (1, '[0.12,0.44,0.81]'),
  (2, '[0.15,0.39,0.77]'),
  (3, '[0.91,0.08,0.11]');

Step 4: backfill the main table in SQL

UPDATE articles a
SET embedding = s.embedding,
    embedding_updated_at = now()
FROM article_embeddings_stage s
WHERE a.id = s.id;

SELECT COUNT(*) AS total_rows,
       COUNT(*) FILTER (WHERE embedding IS NOT NULL) AS embedded_rows,
       COUNT(*) FILTER (WHERE embedding IS NULL) AS missing_rows
FROM articles;
Watch out: Every value written into a vector(n) column must have exactly n dimensions. A mismatch during backfill is the fastest way to turn a clean migration into a batch failure.
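One way to catch this before the UPDATE runs: if you declare the staging column as an untyped vector instead of vector(3) (a variant of the staging table above), you can validate dimensions with pgvector's vector_dims() before the backfill touches the main table:

-- Flag staged embeddings whose dimension does not match the target column.
-- Assumes article_embeddings_stage.embedding was declared as plain "vector".
SELECT id, vector_dims(embedding) AS dims
FROM article_embeddings_stage
WHERE vector_dims(embedding) <> 3;

Any rows returned here would have failed the backfill; fix them in the embedding pipeline and reload the stage before proceeding.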

Add indexes and cut over

Step 5: create the first ANN index

For most semantic search migrations, start with HNSW and vector_cosine_ops.

CREATE INDEX CONCURRENTLY articles_embedding_hnsw_idx
ON articles
USING hnsw (embedding vector_cosine_ops);

If you are indexing a very large table, create the index after the bulk of your backfill is complete. The pgvector README also notes that HNSW builds faster when the graph fits in maintenance_work_mem, so set these before starting the build:

SET maintenance_work_mem = '8GB';
SET max_parallel_maintenance_workers = 7;

Step 6: switch reads to nearest-neighbor ranking

SELECT id,
       title,
       1 - (embedding <=> '[0.14,0.41,0.79]') AS cosine_similarity
FROM articles
WHERE status = 'published'
  AND embedding IS NOT NULL
ORDER BY embedding <=> '[0.14,0.41,0.79]'
LIMIT 10;

If filtered queries return too few matches, increase the HNSW search candidate list for that request:

BEGIN;
SET LOCAL hnsw.ef_search = 100;
SELECT id, title
FROM articles
WHERE status = 'published'
ORDER BY embedding <=> '[0.14,0.41,0.79]'
LIMIT 10;
COMMIT;

Step 7: keep IVFFlat as the fallback option

Use IVFFlat when build speed or memory footprint is more important than the best possible speed-recall tradeoff.

CREATE INDEX CONCURRENTLY articles_embedding_ivfflat_idx
ON articles
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

SET ivfflat.probes = 10;

  • Choose HNSW when search quality under low latency is the priority.
  • Choose IVFFlat when you need a lighter index build path and can tune probes later.
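For the lists parameter, the pgvector README suggests roughly rows / 1000 for tables up to about one million rows and sqrt(rows) beyond that. A quick way to compute a starting value for the articles table (the floor of 10 here is an arbitrary safeguard for small tables, not part of the README guidance):

-- Starting point for the IVFFlat lists parameter, per the README heuristic.
SELECT CASE
         WHEN COUNT(*) <= 1000000 THEN GREATEST(COUNT(*) / 1000, 10)
         ELSE CEIL(SQRT(COUNT(*)))::int
       END AS suggested_lists
FROM articles;

Treat the result as a first guess; revisit lists and probes together once you have recall measurements.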

Verification and expected output

Verify the extension version

SELECT extversion
FROM pg_extension
WHERE extname = 'vector';

Expected output:

 extversion
------------
 0.8.2

Verify the backfill completed

SELECT COUNT(*) FILTER (WHERE embedding IS NULL) AS missing_rows
FROM articles;

Expected output:

 missing_rows
-------------
 0

Verify that PostgreSQL uses the vector index

EXPLAIN ANALYZE
SELECT id, title
FROM articles
WHERE status = 'published'
ORDER BY embedding <=> '[0.14,0.41,0.79]'
LIMIT 10;

What you want to see:

  • An Index Scan or similar plan node referencing articles_embedding_hnsw_idx or articles_embedding_ivfflat_idx.
  • Lower latency than your exact scan baseline.
  • Result quality that is acceptable against your shadow or offline relevance set.

Troubleshooting top 3

  1. Dimension mismatch during backfill. If your column is vector(768), every embedding must have exactly 768 values. Fix this in the embedding pipeline first; do not patch around it in SQL.
  2. Filtered queries return too few results. Approximate indexes apply filters after scanning candidates. Raise hnsw.ef_search, and if you are on 0.8.0+, consider SET hnsw.iterative_scan = strict_order; for filtered workloads that need more recall.
  3. Index builds are too slow. Build after bulk loading, increase maintenance_work_mem carefully, and raise max_parallel_maintenance_workers when the server has room. If HNSW is still too expensive, test IVFFlat instead.
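For troubleshooting item 2, a per-request sketch of an iterative scan (hnsw.iterative_scan and hnsw.max_scan_tuples are available from pgvector 0.8.0; the 40000 cap is an illustrative value, not a recommendation):

BEGIN;
SET LOCAL hnsw.iterative_scan = relaxed_order;  -- or strict_order for exact distance ordering
SET LOCAL hnsw.max_scan_tuples = 40000;         -- cap how many tuples the scan may visit
SELECT id, title
FROM articles
WHERE status = 'published'
ORDER BY embedding <=> '[0.14,0.41,0.79]'
LIMIT 10;
COMMIT;

SET LOCAL keeps the change scoped to the transaction, so heavily filtered requests can pay the extra scan cost without affecting other queries.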

What's next

  • Add relevance evaluation to CI or your release checklist so vector search does not quietly regress.
  • Test partial indexes or partitioning if your workload is heavily filtered by tenant or category.
  • Consider halfvec if index size becomes the next bottleneck.
  • Track future pgvector tags directly before upgrading; do not plan around untagged releases.

Primary sources for this guide: pgvector README and the official tags page.

Frequently Asked Questions

Can pgvector replace my legacy SQL search stack without moving to a separate vector database?
Often, yes. If your workload already lives in PostgreSQL and your filters, joins, and transactional writes matter, pgvector lets you add vector ranking without splitting the system. You still need to validate recall and latency, but the operational model stays much simpler than adding a separate store.
Should I start with HNSW or IVFFlat in pgvector?
Start with HNSW for most production semantic search rollouts because it offers the best speed-recall tradeoff. Use IVFFlat when build time and memory pressure matter more, or when you want a lighter first step before tuning lists and probes.
Why is PostgreSQL ignoring my pgvector index?
The planner will usually need the query shape it expects: an ORDER BY on the vector distance operator plus a LIMIT. If you remove the ordering, use the wrong operator class, or set IVFFlat probes equal to the number of lists, PostgreSQL may choose a different plan.
Can I upgrade pgvector without regenerating embeddings?
Usually yes. Embeddings are just values stored in your table, so upgrading the extension normally means installing the new build and running ALTER EXTENSION vector UPDATE;. You only need to regenerate embeddings if you also change models, dimensions, or your application-side embedding strategy.
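The upgrade path the last answer describes can be sketched as follows, after installing the new build with make && make install:

ALTER EXTENSION vector UPDATE;
SELECT extversion FROM pg_extension WHERE extname = 'vector';

The stored vectors are untouched; only the extension's functions, operators, and index access methods are updated.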
