90x faster than pgvector — Lantern's HNSW Index Creation Time

October 20, 2023 · 8 min read

Varik Matevosyan

Varik Matevosyan

Software Engineer

Lantern is a Postgres extension to enable performant vector search using an index. Lantern is built using Usearch, an optimized C++ implementation of the HNSW algorithm, the most performant algorithm for vector search.

In this post we discuss the significance of index creation times and how Lantern enables 90x faster index creation times than pgvector, another popular vector search Postgres extension, using external indexing. We also compare Lantern's performance against Pinecone, a popular, closed-source hosted vector database.

Why fast index creation matters

Index creation time affects how quickly a developer can add vector search to their data. On a single core, generating an index on 1 billion vectors could take days or even weeks.

It also affects how quickly a developer can experiment with different parameters to optimize their index. HNSW relies on two parameters for index construction:

  • The M parameter controls the number of neighbors that each node in the graph will maintain. Higher values lead to longer construction times, longer query times, and higher memory usage, but result in a higher-quality index.
  • The ef_construction parameter determines how many nodes will be traversed during index construction. Higher values lead to longer construction times but result in a higher-quality index.

Generally, there is a tradeoff between recall and latency. The ideal set of parameters requires experimentation to find. It depends on the application's recall / latency needs and the data itself (what recall is possible given the data distribution). This experimentation could become untenable with slow index creation times.

How external index creation enables parallelism

With Postgres, HNSW index creation is single-threaded. This prevents the utilization of multiple cores to speed up index creation. In addition, the index creation process is resource-intensive, which can slow down other database operations. The latter would pose a problem even if HNSW index creation were multi-threaded.

Lantern allows developers to create an index externally, and then import the index as a file into their database. With external index creation, the core database remains unburdened during index creation, and the index can be created using multiple cores. This enables significant performance improvements.

Below we show the results of two sets of experiments with Lantern: one with index creation occurring inside Postgres, and one with index creation occurring externally. We compare Lantern's performance against pgvector and Pinecone.

Lantern's single-core index creation performance

Experiment Setup

We use the following datasets

  • sift - 1 million vectors of 128 dimensions, downloadable here
  • wiki - 1 million vectors of 1536 dimensions generated using the text-embedding-ada-002 model

The experiments were run on a Linode instance with 32 Cores and 64GB of RAM.

The SQL to create a table, copy the data, and create the index using Lantern follows

sql
Copy
CREATE TABLE wiki1m (id SERIAL, v REAL[]);
COPY wiki1m (v) FROM '/tmp/wiki1m.csv' WITH CSV;
CREATE INDEX ON wiki1m USING lantern_hnsw (v) WITH (dim=128, m=8, ef_construction=128, ef=128);

Table: Index Creation Times for Sift

Vector Database

Time

Vec/s

Lantern

8m 30s

~1960 vec/s

Pinecone (p2x2 - 1 pod)

9m

~1818 vec/s

Pgvector

46m

~361 vec/s

Table: Index Creation Times for Wiki

Vector Database

Time

Vec/s

Lantern

44m

~382 vec/s

Pinecone (p2x2 - 1 pod)

30m

~555 vec/s

Pgvector

2h

~140 vec/s

Graph #1: Index Creation Speed

Index Creation Speed (Single Core)

Using external index creation, Lantern is 90x faster than pgvector

Results Overview

  • 17x performance improvement compared to creating the index on a single thread.
  • 90x performance improvement compared to pgvector, and a 6x improvement over Pinecone for sift dataset
  • 48x performance improvement over pgvector, as well as a 3x improvement over Pinecone with 32 pods
  • Pinecone index on 32 p2 pods costs $3,889.44 / month. 32 CPU Linode costs $576 / month. Lantern is over 6x cheaper and 6x faster!

Graph #2: Index Creation Speed with 32 Cores

Index Creation Speed (Multi Core)

Graph #3: Index Creation Speed with 2 - 32 Cores

Index creation speed with Lantern Multi-Core Index

Fine-tuning the recall using parameters m=16, ef_construction=128, and ef=128 for Lantern, we can achieve a 99% recall@5 for sift dataset, with the index creation taking only 50 seconds.

Notice that a 2 Core Linode Server running Lantern outperforms Pinecone’s 32 pod cluster - this means that Lantern can be 60x cheaper than Pinecone for the same performance.

External index creation with Lantern

Implementation Details

Here is a brief overview of how Lantern's external index creation works under the hood.

We use the row's ctid as a label for our nodes during index creation. Later, this label is used to retrieve the actual row from Postgres, as it represents the physical location of the row.

We use Postgres's large object functionality for data transfer. We use the lo_export function to export the indexable data to the file system. After generating the index using Usearch on the file system, we then use the lo_import function to transfer the index file to the database server.

How to use external index creation

  1. Install the Lantern CLI
  2. Create an index from a regular Postgres table via the create-index utility

Here is a walk through:

First, install the Lantern CLI:

bash
Copy
cargo install --git https://github.com/lanterndata/lantern_extras --bin lantern-cli

Next, use the create-index utility to externally create and import our index to Postgres:

Note: If you encounter ONNXRuntime, issues, set up onnx runtime manually using these steps and export the required env variables.

bash
Copy
lantern-cli create-index \
    --uri postgresql://postgres@localhost:5432/testlive \
    --table sift \
    --column v \
    -m 8 \
    --efc 128 \
    --ef 64 \
    -d 128 \
    --metric-kind cos \
    --out index.usearch \
    --import

The --uri parameter specifies the database connection URI, and the --table and --column parameters specify the table and column to index. The index parameters are -m, --efc, --ef, -d (dimension of column), and --metric-kind (l2sq, cos, or hamming). The --out parameter specifies the output file name. If --import is also specified, the externally created index will be automatically imported to Postgres after creation, and the temporary output file will be removed.

The table will be exclusively locked during the index creation process to avoid inconsistency. To reindex the data, use select lantern_reindex_external_index('<index_name>'); if you have lantern_extras extension installed in your database, or use the same CLI command as above and provide the --index-name argument. In the latter case, the existing index will be dropped and a new one will be created.

Conclusion

In this post, we discussed the significance of index creation times, Lantern's external indexing process, and shared our benchmarking results against pgvector and Pinecone.

For a deeper dive into the code, you can check out our core repo or the Lantern Extras repo. Our core repo enables vector search in Postgres, while Lantern Extras provides routines for external index generation and embedding management.

Revisions

January 4, 2024 - Correction to Graph #3

While re-running benchmarks, we observed that the previous graph for multi-core indexing performance was generated incorrectly. It indicated that external index generation was faster on 1-8 cores than it was. The graph has been updated above to reflect the correct data. No other data in the article has changed.

January 4, 2024 - Additional benchmarks

In response to a comment on HackerNews on January 3, 2024:

This post and experiments were created before pgvector@0.5.1 was released and are based on pgvector@0.5.0. Significant work was done as part of pgvector@0.5.1 release to optimize index creation time. As such, we re-ran pgvector and Lantern benchmarks with database parameter tuning, as requested.

Pgvector

Lantern

Wiki

36m

43m

Sift

12m 30s

7m

This used the following database parameters for both Lantern and pgvector

  • shared_buffers=12GB
  • maintenance_work_mem=5GB
  • work_mem=2GB

The database parameters for the previous results were the Postgres defaults for both Lantern and pgvector.

In both experiments, benchmarking was done using psql's \timing function and used the 32CPU/64GB RAM machine Linode Dedicated 64.

Share this post