Lantern is a Postgres extension that enables performant vector search using an index. It is built on Usearch, an optimized C++ implementation of HNSW, one of the most performant algorithms for vector search.
In this post, we discuss why index creation time matters and how Lantern's external indexing enables index creation up to 90x faster than pgvector, another popular vector search Postgres extension. We also compare Lantern's performance against Pinecone, a popular closed-source hosted vector database.
Why fast index creation matters
Index creation time affects how quickly a developer can add vector search to their data. On a single core, generating an index on 1 billion vectors could take days or even weeks.
It also affects how quickly a developer can experiment with different parameters to optimize their index. HNSW relies on two parameters for index construction:
- The `M` parameter controls the number of neighbors each node in the graph maintains. Higher values lead to longer construction times, longer query times, and higher memory usage, but result in a higher-quality index.
- The `ef_construction` parameter determines how many nodes are traversed during index construction. Higher values lead to longer construction times but result in a higher-quality index.
Generally, there is a tradeoff between recall and latency. The ideal set of parameters requires experimentation to find. It depends on the application's recall / latency needs and the data itself (what recall is possible given the data distribution). This experimentation could become untenable with slow index creation times.
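To make the experimentation concrete, here is a minimal sketch of what a parameter sweep might look like, assuming a hypothetical `items` table with 128-dimensional vectors and using the `lantern_hnsw` index syntax shown later in this post:

```sql
-- Hypothetical parameter sweep: build the same column with two different HNSW
-- settings and compare build times (e.g. with psql's \timing) and recall.

-- Smaller graph: faster to build, typically lower recall
CREATE INDEX items_v_m8 ON items USING lantern_hnsw (v)
    WITH (dim=128, m=8, ef_construction=64, ef=64);
DROP INDEX items_v_m8;  -- drop before building the next candidate

-- Denser graph: slower to build, typically higher recall
CREATE INDEX items_v_m16 ON items USING lantern_hnsw (v)
    WITH (dim=128, m=16, ef_construction=128, ef=128);
```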
How external index creation enables parallelism
In Postgres, HNSW index creation is single-threaded, which prevents the use of multiple cores to speed up index creation. In addition, the index creation process is resource-intensive and can slow down other database operations; the latter would be a problem even if HNSW index creation were multi-threaded.
Lantern allows developers to create an index externally, and then import the index as a file into their database. With external index creation, the core database remains unburdened during index creation, and the index can be created using multiple cores. This enables significant performance improvements.
Below we show the results of two sets of experiments with Lantern: one with index creation occurring inside Postgres, and one with index creation occurring externally. We compare Lantern's performance against pgvector and Pinecone.
Lantern's single-core index creation performance
Experiment Setup
We use the following datasets:
- sift - 1 million vectors of 128 dimensions, downloadable here
- wiki - 1 million vectors of 1536 dimensions, generated using the `text-embedding-ada-002` model
The experiments were run on a Linode instance with 32 cores and 64 GB of RAM.
The SQL to create a table, copy the data, and create the index using Lantern follows:
CREATE TABLE wiki1m (id SERIAL, v REAL[]);
COPY wiki1m (v) FROM '/tmp/wiki1m.csv' WITH CSV;
CREATE INDEX ON wiki1m USING lantern_hnsw (v) WITH (dim=1536, m=8, ef_construction=128, ef=128);
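For completeness, once the index is built, nearest-neighbor queries can use it. The following is a self-contained toy sketch, assuming the `<->` (L2-squared) distance operator used in Lantern's examples at the time of writing; check the Lantern documentation for the operator matching your version and metric:

```sql
-- Toy, self-contained example of querying through a Lantern HNSW index
-- (table, data, and operator choice are illustrative)
CREATE TABLE toy (id SERIAL, v REAL[]);
INSERT INTO toy (v) VALUES ('{0,0,0}'), ('{0,0,1}'), ('{1,1,1}');
CREATE INDEX ON toy USING lantern_hnsw (v) WITH (dim=3, m=8, ef_construction=64, ef=64);

SET enable_seqscan = false;  -- nudge the planner toward the index on a tiny table
SELECT id FROM toy ORDER BY v <-> ARRAY[0,0,0.5]::real[] LIMIT 2;
```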
Table: Index Creation Times for Sift
Vector Database | Time | Vec/s |
---|---|---|
Lantern | 8m 30s | ~1960 vec/s |
Pinecone (p2x2 - 1 pod) | 9m | ~1818 vec/s |
Pgvector | 46m | ~361 vec/s |
Table: Index Creation Times for Wiki
Vector Database | Time | Vec/s |
---|---|---|
Lantern | 44m | ~382 vec/s |
Pinecone (p2x2 - 1 pod) | 30m | ~555 vec/s |
Pgvector | 2h | ~140 vec/s |
Graph #1: Index Creation Speed
Using external index creation, Lantern is 90x faster than pgvector
Results Overview
- 17x performance improvement compared to creating the index on a single thread
- 90x performance improvement compared to pgvector, and a 6x improvement over Pinecone, for the sift dataset
- 48x performance improvement over pgvector, and a 3x improvement over Pinecone with 32 pods
- A Pinecone index on 32 p2 pods costs $3,889.44 / month, while a 32-core Linode costs $576 / month. Lantern is over 6x cheaper and 6x faster!
Graph #2: Index Creation Speed with 32 Cores
Graph #3: Index Creation Speed with 2 - 32 Cores
Fine-tuning the recall with the parameters `m=16`, `ef_construction=128`, and `ef=128` for Lantern, we can achieve 99% recall@5 on the sift dataset, with index creation taking only 50 seconds.
Notice that a 2-core Linode server running Lantern outperforms Pinecone's 32-pod cluster, which means Lantern can be 60x cheaper than Pinecone for the same performance.
External index creation with Lantern
Implementation Details
Here is a brief overview of how Lantern's external index creation works under the hood.
We use the row's `ctid` as a label for our nodes during index creation. This label is later used to retrieve the actual row from Postgres, as it represents the physical location of the row.
We use Postgres's large object functionality for data transfer. The `lo_export` function exports the indexable data to the file system. After generating the index with Usearch on the file system, we use the `lo_import` function to transfer the index file back to the database server.
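The large-object calls themselves are standard Postgres. Below is a simplified sketch of the two transfer steps; the OID, file paths, and table name are illustrative, and Lantern's actual implementation handles packing the `ctid` labels and vectors into the exported data:

```sql
-- ctid is the physical location of a row; it serves as the node label in the index
SELECT ctid, v FROM sift LIMIT 3;

-- Step 1: write a large object containing the indexable data out to the file system
-- (12345 is an illustrative large-object OID)
SELECT lo_export(12345, '/tmp/sift_data.bin');

-- ... the index is built externally with Usearch over that file ...

-- Step 2: load the finished index file back into the database as a large object
SELECT lo_import('/tmp/index.usearch');
```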
How to use external index creation
- Install the Lantern CLI
- Create an index from a regular Postgres table via the `create-index` utility

Here is a walkthrough:
First, install the Lantern CLI:
cargo install --git https://github.com/lanterndata/lantern_extras --bin lantern-cli
Next, use the `create-index` utility to externally create the index and import it into Postgres:
Note: If you encounter ONNX Runtime issues, set up ONNX Runtime manually using these steps and export the required environment variables.
lantern-cli create-index \
--uri postgresql://postgres@localhost:5432/testlive \
--table sift \
--column v \
-m 8 \
--efc 128 \
--ef 64 \
-d 128 \
--metric-kind cos \
--out index.usearch \
--import
The `--uri` parameter specifies the database connection URI, and the `--table` and `--column` parameters specify the table and column to index. The index parameters are `-m`, `--efc`, `--ef`, `-d` (the dimension of the column), and `--metric-kind` (`l2sq`, `cos`, or `hamming`). The `--out` parameter specifies the output file name. If `--import` is also specified, the externally created index will be automatically imported to Postgres after creation, and the temporary output file will be removed.
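To confirm the import succeeded, you can list the indexes on the table with a standard catalog query (the table name below assumes the `sift` example above):

```sql
-- pg_indexes is a standard Postgres catalog view; the imported index should appear here
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'sift';
```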
The table will be exclusively locked during the index creation process to avoid inconsistency. To reindex the data, run `SELECT lantern_reindex_external_index('<index_name>');` if you have the `lantern_extras` extension installed in your database, or use the same CLI command as above and provide the `--index-name` argument. In the latter case, the existing index will be dropped and a new one will be created.
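For example, if the imported index were named `sift_v_idx` (a hypothetical name), reindexing with `lantern_extras` installed would look like:

```sql
-- Rebuilds the external index for an existing index; the index name is hypothetical
SELECT lantern_reindex_external_index('sift_v_idx');
```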
Conclusion
In this post, we discussed why index creation time matters, described Lantern's external indexing process, and shared our benchmarking results against pgvector and Pinecone.
For a deeper dive into the code, you can check out our core repo or the Lantern Extras repo. Our core repo enables vector search in Postgres, while Lantern Extras provides routines for external index generation and embedding management.
Revisions
January 4, 2024 - Correction to Graph #3
While re-running benchmarks, we observed that the previous graph for multi-core indexing performance was generated incorrectly: it showed external index generation on 1-8 cores as faster than it actually was. The graph has been updated above to reflect the correct data. No other data in the article has changed.
January 4, 2024 - Additional benchmarks
In response to a comment on HackerNews on January 3, 2024:
This post and its experiments were created before `pgvector@0.5.1` was released and are based on `pgvector@0.5.0`. Significant work was done as part of the `pgvector@0.5.1` release to optimize index creation time. As such, we re-ran the pgvector and Lantern benchmarks with database parameter tuning, as requested.
Dataset | Pgvector | Lantern |
---|---|---|
Wiki | 36m | 43m |
Sift | 12m 30s | 7m |
This used the following database parameters for both Lantern and pgvector:
- `shared_buffers=12GB`
- `maintenance_work_mem=5GB`
- `work_mem=2GB`

The database parameters for the previous results were the Postgres defaults for both Lantern and pgvector.
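For reference, these settings can be applied with standard Postgres commands (a sketch; `shared_buffers` only takes effect after a server restart):

```sql
-- Parameters used for the re-run benchmarks
ALTER SYSTEM SET shared_buffers = '12GB';        -- requires a restart to take effect
ALTER SYSTEM SET maintenance_work_mem = '5GB';
ALTER SYSTEM SET work_mem = '2GB';
SELECT pg_reload_conf();  -- reloads settings that do not require a restart
```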
In both experiments, benchmarking was done using `psql`'s `\timing` meta-command on the same 32 CPU / 64 GB RAM machine (Linode Dedicated 64).