Blog

Topics include vector search algorithms, use cases and applications, tutorials, templates, performance, and benchmarks

Postgres·Pgvector

CREATE INDEX EXTERNALLY: Offloading pgvector Indexing from Postgres

We introduce external indexing for pgvector in Postgres to reduce performance issues caused by large dataset indexing. External indexing offloads the indexing process to external machines, reducing the impact on database performance.

October 1, 2024 · 6 min read

Varik Matevosyan

Software Engineer

Postgres·Pgvector

Understanding pgvector's HNSW Index Storage in Postgres

In this article, we'll explore how pgvector works under the hood, focusing on how the HNSW index is stored in Postgres.

August 19, 2024 · 8 min read

Varik Matevosyan

Software Engineer

Cloud·Postgres

Analysis of Key PostgreSQL Configuration Differences Among Cloud Providers

We discuss differences in PostgreSQL configuration parameters among cloud providers, why they matter, and their implications for users.

August 13, 2024 · 5 min read

Narek Galstyan

Cofounder

Learn·Benchmark·Pgvector

Hybrid Vector Search in Postgres

Hybrid vector search combines the strengths of sparse and dense vector searches to improve search quality. We evaluate its performance on several datasets from the BEIR framework using Postgres.

August 5, 2024 · 8 min read

Di Qi

Cofounder

Pinecone·Postgres·Benchmark·Pgvector

Postgres vs. Pinecone

We respond to Pinecone's recent blog post comparing Postgres and Pinecone. We show that Postgres can outperform Pinecone in the same benchmarks Pinecone covered in their article.

July 18, 2024 · 11 min read

Narek Galstyan

Cofounder

Pinecone

When using Pinecone, do not rely on returned scores - recalculate distances locally

When benchmarking Pinecone against Postgres, we noticed some inconsistencies with returned results. We recommend recalculating and re-sorting returned results locally when using Pinecone for nearest neighbor vector queries.

July 17, 2024 · 4 min read

Narek Galstyan

Cofounder

Postgres

Dynamically loaded extensions in Postgres in the browser

We build on top of pglite to support loading Postgres extensions directly in the browser. We share a demo using geospatial search and vector search from Lantern, and discuss technical challenges we encountered.

July 12, 2024 · 6 min read

Varik Matevosyan

Software Engineer

Postgres·Learn·Permissions

Understanding Postgres Table Permissions

This blog post helps interpret Postgres's terse permission strings with examples and visuals.

July 10, 2024 · 7 min read

Narek Galstyan

Cofounder

Postgres·Vector·Optimization

Use separate tables for asynchronous embedding generation

In this article, we'll explore a problem faced when extending traditional Postgres tables with an additional column for embeddings. We will describe a typical setting for asynchronous embedding generation and will show how table schema design can have a significant impact on storage and performance efficiency.

May 8, 2024 · 7 min read

Narek Galstyan

Cofounder

Postgres·WAL·Index

Understanding and Estimating Write-Ahead Log (WAL) Size in Postgres

This blog post looks into the underlying plumbing of WAL records - when they are created, what information they contain, and how different database operations like indexing affect the kind of WAL records being created.

May 7, 2024 · 10 min read

Narek Galstyan

Cofounder

Product

April 2024 - Engineering Updates

Engineering updates for April 2024, including asynchronous tasks, weighted vector search, and more.

May 1, 2024 · 2 min read

Di Qi

Cofounder

Benchmark·Product

Product Quantization in Postgres

We implemented product quantization in Lantern and benchmarked it using the LAION 100M 768-dimensional vector dataset.

March 5, 2024 · 8 min read

Di Qi

Cofounder

Varik Matevosyan

Software Engineer

Benchmark·OpenAI

Evaluating OpenAI's new embedding models with Lantern and Parea AI

OpenAI's newest embedding models promise huge performance increases. Using Lantern's Postgres vector database, and Parea AI's testing platform, we'll measure the new models in a real-world test.

February 29, 2024 · 6 min read

Di Qi

Cofounder

Learn

Vector databases explained

We give an overview of vector databases, and major concepts around them, including vector embeddings, vector indexing, and vector search.

March 29, 2024 · 12 min read

Di Qi

Cofounder

Search·Algolia·ElasticSearch

Postgres vs ElasticSearch vs Algolia - Comparing the Best Search Solutions

Don't overcomplicate things with an external search engine like ElasticSearch or Algolia. Postgres as your search engine and database makes things simple and scalable for a few great reasons.

February 2, 2024 · 11 min read

Di Qi

Cofounder

Learn

Improving vector search over documents using HyDE

The HyDE technique uses LLMs to improve the quality of vector search over text. In this article, we explain how it works and walk through an example with Lantern as a vector database.

January 22, 2024 · 7 min read

Di Qi

Cofounder

Danyil Blyschak

Software Engineer

Embedding·Learn

Picking the right embedding model for your vector database

Why choosing the right embedding model for vector search makes all the difference, and how to experiment with embedding models more effectively with Lantern.

January 15, 2024 · 12 min read

Di Qi

Cofounder

Danyil Blyschak

Software Engineer

Pinecone·Migration·Product·Learn

Migrating from Pinecone to Lantern

Pinecone is a popular closed-source vector database. We built a Python library to support migrating data from Pinecone to Lantern. This article covers how we built it and how to use it.

January 6, 2024 · 6 min read

Di Qi

Cofounder

HNSW·Learn·Memory·Index·Quantization·Vector

Estimating memory footprint of your HNSW index

This interactive visualization will help you quickly reason about the resources necessary to host your embeddings and serve nearest neighbor queries over them.

December 22, 2023 · 10 min read

Narek Galstyan

Cofounder

Learn·Search·Postgres

Full Text Search + Vector Search with Postgres

Search is a common need for many applications. Postgres supports search out of the box with regex matching and full text search. This can be augmented with vector search.

December 16, 2023 · 11 min read

Di Qi

Cofounder

Learn·Vector·HNSW·Index

The Hierarchical Navigable Small Worlds (HNSW) Algorithm

An explanation of the Hierarchical Navigable Small Worlds (HNSW) Algorithm

November 19, 2023 · 4 min read

Di Qi

Cofounder

Learn·Embedding

Embeddings and choosing the right model

An overview of embeddings and what to consider when choosing an embedding model

November 13, 2023 · 6 min read

Di Qi

Cofounder

Benchmark·Product

90x faster than pgvector — Lantern's HNSW Index Creation Time

Index creation time is a critical database metric. Learn more about how Lantern enables 90x faster index creation times than pgvector, and how Lantern compares to Pinecone.

October 20, 2023 · 8 min read

Varik Matevosyan

Software Engineer

Company

Launching Lantern — a PostgreSQL vector database for building AI applications

Today we’re launching Lantern, an open-source PostgreSQL vector database.

September 13, 2023 · 3 min read

Narek Galstyan

Cofounder