Blog

Topics include vector search algorithms, use cases and applications, tutorials, templates, performance, and benchmarks

LLM Completions in Postgres

With v0.5.0, we’re releasing llm_completion and add_completion_job, which enable LLM calls inside Postgres.

Di Qi

Cofounder

November 20, 2024 · 7 min read

A code chatbot in 15 minutes with Postgres

Learn how to build a powerful, scalable code chatbot using Lantern on Ubicloud in under 15 minutes. All open-source tooling, from embedding models to cloud providers.

Di Qi

Cofounder

October 24, 2024 · 7 min read

Bank Compliance Automation with Lantern and Ecliptor

In this article, we show how to use Lantern, Ecliptor, and OpenAI models to automate compliance checks in the banking industry.

Di Qi

Cofounder

October 21, 2024 · 9 min read

CREATE INDEX EXTERNALLY: Offloading pgvector Indexing from Postgres

We introduce external indexing for pgvector in Postgres to reduce performance issues from indexing large datasets by offloading the indexing process to external machines.

Varik Matevosyan

Software Engineer

October 1, 2024 · 6 min read

Understanding pgvector's HNSW Index Storage in Postgres

In this article, we'll explore how pgvector works under the hood, focusing on how the HNSW index is stored in Postgres.

Varik Matevosyan

Software Engineer

August 19, 2024 · 8 min read

Analysis of Key PostgreSQL Configuration Differences Among Cloud Providers

We discuss some differences in PostgreSQL configuration parameters among cloud providers, why they matter, and their implications for users.

Narek Galstyan

Cofounder

August 13, 2024 · 5 min read

Hybrid Vector Search in Postgres

Hybrid vector search combines the strengths of sparse and dense vector searches to improve search quality. We evaluate its performance on several datasets from the BEIR framework using Postgres.

Di Qi

Cofounder

August 5, 2024 · 8 min read

Postgres vs. Pinecone

We respond to Pinecone's recent blog post comparing Postgres and Pinecone. We show that Postgres can outperform Pinecone in the same benchmarks Pinecone covered in their article.

Narek Galstyan

Cofounder

July 18, 2024 · 11 min read

When using Pinecone, do not rely on returned scores - recalculate distances locally

When benchmarking Pinecone against Postgres, we noticed some inconsistencies with returned results. We recommend recalculating and re-sorting returned results locally when using Pinecone for nearest neighbor vector queries.

Narek Galstyan

Cofounder

July 17, 2024 · 4 min read

Dynamically loaded extensions in Postgres in the browser

We build on top of pglite to support loading Postgres extensions directly in the browser. We share a demo using geospatial search and vector search from Lantern, and discuss technical challenges we encountered.

Varik Matevosyan

Software Engineer

July 12, 2024 · 6 min read

Understanding Postgres Table Permissions

This blog post helps interpret Postgres's terse permission strings with examples and visuals.

Narek Galstyan

Cofounder

July 10, 2024 · 7 min read

Use separate tables for asynchronous embedding generation

In this article, we'll explore a problem faced when extending traditional Postgres tables with an additional column for embeddings. We will describe a typical setting for asynchronous embedding generation and will show how table schema design can have a significant impact on storage and performance efficiency.

Narek Galstyan

Cofounder

May 8, 2024 · 7 min read

Understanding and Estimating Write-Ahead Log (WAL) Size in Postgres

This blog post looks into the underlying plumbing of WAL records - when they are created, what information they contain, and how different database operations like indexing affect the kind of WAL records being created.

Narek Galstyan

Cofounder

May 7, 2024 · 10 min read

April 2024 - Engineering Updates

Engineering updates for April 2024, including asynchronous tasks, weighted vector search, and more.

Di Qi

Cofounder

May 1, 2024 · 2 min read

Product Quantization in Postgres

We implemented product quantization in Lantern and benchmarked it using the LAION 100M 768-dimensional vector dataset.

Di Qi

Cofounder

Varik Matevosyan

Software Engineer

March 5, 2024 · 8 min read

Evaluating OpenAI's new embedding models with Lantern and Parea AI

OpenAI's newest embedding models promise huge performance increases. Using Lantern's Postgres vector database, and Parea AI's testing platform, we'll measure the new models in a real-world test.

Di Qi

Cofounder

February 29, 2024 · 6 min read

Vector databases explained

We give an overview of vector databases, and major concepts around them, including vector embeddings, vector indexing, and vector search.

Di Qi

Cofounder

March 29, 2024 · 12 min read

Postgres vs ElasticSearch vs Algolia - Comparing the Best Search Solutions

Don't overcomplicate things with an external search engine like ElasticSearch or Algolia. Postgres as your search engine and database makes things simple and scalable for a few great reasons.

Di Qi

Cofounder

February 2, 2024 · 11 min read

Improving vector search over documents using HyDE

The HyDE technique uses LLMs to improve the quality of vector search over text. In this article, we explain how it works and walk through an example with Lantern as a vector database.

Di Qi

Cofounder

Danyil Blyschak

Software Engineer

January 22, 2024 · 7 min read

Picking the right embedding model for your vector database

Why choosing the right embedding model for vector search makes all the difference, and how to experiment with embedding models more effectively with Lantern.

Di Qi

Cofounder

Danyil Blyschak

Software Engineer

January 15, 2024 · 12 min read

Migrating from Pinecone to Lantern

Pinecone is a popular closed-source vector database. We built a Python library to support migrating data from Pinecone to Lantern. This article covers how we built it and how to use it.

Di Qi

Cofounder

January 6, 2024 · 6 min read

Estimating memory footprint of your HNSW index

This interactive visualization will help you quickly reason about the resources necessary to host your embeddings and serve nearest neighbor queries over them.

Narek Galstyan

Cofounder

December 22, 2023 · 10 min read

Full Text Search + Vector Search with Postgres

Search is a common need for many applications. Postgres supports search out of the box with regex matching and full text search. This can be augmented with vector search.

Di Qi

Cofounder

December 16, 2023 · 11 min read

The Hierarchical Navigable Small Worlds (HNSW) Algorithm

An explanation of the Hierarchical Navigable Small Worlds (HNSW) algorithm.

Di Qi

Cofounder

November 19, 2023 · 4 min read

Embeddings and choosing the right model

An overview of embeddings and what to consider when choosing an embedding model

Di Qi

Cofounder

November 13, 2023 · 6 min read

90x faster than pgvector — Lantern's HNSW Index Creation Time

Index creation time is a critical database metric. Learn more about how Lantern enables 90x faster index creation times than pgvector, and how Lantern compares to Pinecone.

Varik Matevosyan

Software Engineer

October 20, 2023 · 8 min read

Launching Lantern — a PostgreSQL vector database for building AI applications

Today we're launching Lantern, an open-source PostgreSQL vector database.

Narek Galstyan

Cofounder

September 13, 2023 · 3 min read