We're excited to release llm_completion and add_completion_job, which enable LLM calls inside Postgres. This enables ad hoc LLM completion queries, as well as bulk completions over entire columns.
It's as simple as calling:
SELECT llm_completion('Summarize the text', 'def hello_world(): print("Hello world")');
SELECT add_completion_job('articles', 'full_text', 'summary', 'Summarize the text');
We support OpenAI LLM models (including GPT-4o, GPT-4o mini, o1-preview, and o1-mini).
We also support any custom LLM deployment. Previously, we showed how to build a code chatbot in 15 minutes with Ubicloud and open-source embedding and LLM models.
Why LLM completions in SQL?
There are many use cases for LLMs in Postgres. You can use them to fill in missing data, summarize existing data, or extract structured data from unstructured text, all within your existing database.
Summarize the contents of a column:
SELECT id, llm_completion('Summarize the following text: ' || full_text) AS summary FROM documents;
Fill in a new column using existing data:
UPDATE products SET description = llm_completion('Generate a product description for: ' || name) WHERE description IS NULL;
Extract structured data from an existing, unstructured text column:
SELECT llm_completion('Extract the key entities and return them as a JSON text array from: ' || notes) AS entities FROM meeting_notes;
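Since the model's output is plain text, you can also persist it. A sketch that extends the example above (the entities column is hypothetical, and the ::jsonb cast assumes the model returns valid JSON):
ALTER TABLE meeting_notes ADD COLUMN IF NOT EXISTS entities jsonb;
UPDATE meeting_notes
SET entities = llm_completion('Extract the key entities and return them as a JSON text array from: ' || notes)::jsonb
WHERE entities IS NULL;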
The API for ad hoc LLM completion is simple:
SELECT llm_completion(
  user_prompt => 'User input', -- User prompt for the LLM
  model => 'gpt-4o', -- Model for the runtime to use (default: 'gpt-4o')
  system_prompt => 'Provide short summary for the given text', -- System prompt for the LLM (default: '')
  base_url => 'https://api.openai.com', -- Server URL for a custom LLM deployment (default: OpenAI API URL)
  api_token => '<llm_api_token>', -- API token for the LLM server (default: inferred from the lantern_extras.llm_token GUC)
  azure_entra_token => '', -- For Azure deployments, you can authenticate with an Entra token instead
  runtime => 'openai' -- Runtime to use (default: 'openai'). Use `SELECT get_available_runtimes()` for the full list
);
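Because the API token defaults to the lantern_extras.llm_token setting, most calls only need a prompt once the key is configured. A minimal sketch, assuming an OpenAI key (ALTER DATABASE settings apply to new connections, so reconnect before calling):
ALTER DATABASE postgres SET lantern_extras.llm_token = '<llm_api_token>';
-- In a new connection:
SELECT llm_completion(user_prompt => 'Summarize: def hello_world(): print("Hello world")');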
Bulk LLM completions with add_completion_job
While llm_completion is perfect for ad hoc LLM calls or processing a few rows, handling LLM calls over large tables benefits from a more efficient approach. A single LLM call can take seconds, so running calls sequentially over thousands of rows is impractical: even 1,000 rows could take an hour or more.
For these cases, you can use add_completion_job to parallelize your LLM calls over table rows. OpenAI supports up to 30,000 calls per minute for its LLM models, allowing you to process large volumes of data quickly in parallel.
SELECT add_completion_job(
  table_name => 'articles', -- Name of the table
  src_column => 'content', -- Source column containing the LLM input
  dst_column => 'content_summary', -- Destination column for the LLM response (will be created automatically)
  system_prompt => 'Provide short summary for the given text', -- System prompt for the LLM (default: '')
  column_type => 'TEXT', -- Destination column type
  model => 'gpt-4o', -- Model for the runtime to use (default: 'gpt-4o')
  pk => 'id', -- Primary key of the table; the table is required to have one (default: 'id')
  schema => 'public', -- Schema the table is located in (default: 'public')
  base_url => 'https://api.openai.com', -- Server URL for a custom LLM deployment (default: OpenAI API URL)
  batch_size => 10, -- Input batch size per request to the LLM server; choose based on your API tier (default: determined by model and runtime)
  api_token => '<llm_api_token>', -- API token for the LLM server (default: inferred from the lantern_extras.llm_token GUC)
  azure_entra_token => '', -- For Azure deployments, you can authenticate with an Entra token instead
  runtime => 'openai' -- Runtime to use (default: 'openai'). Use `SELECT get_available_runtimes()` for the full list
);
For example, to classify customer feedback in bulk:
SELECT add_completion_job('user_feedback_responses', 'user_response', 'sentiment_score', 'You will receive user feedback. Return 1 if the response is positive, -1 if negative, or 0 if neutral.');
This schedules a background job to process each row without blocking your main operations.
Keep track of your completion jobs with:
SELECT * FROM get_completion_jobs();
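Once a job finishes, the destination column is populated in place, so results are ordinary rows you can filter and join. For example, to pull negative feedback from the job above (sentiment_score is TEXT by default, holding the model's '1', '-1', or '0' output):
SELECT user_response, sentiment_score
FROM user_feedback_responses
WHERE sentiment_score = '-1'
LIMIT 10;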
Embedding Generation
LLM completion pairs nicely with embedding generation for retrieval-augmented generation (RAG) use cases. Embeddings convert text into vectors, which enable efficient similarity search over unstructured data. The embedding generation API in Lantern is analogous to the API for LLM completions.
For one-off embedding generation, use llm_embedding:
SELECT llm_embedding(
  input => 'User input', -- Text to embed
  model => 'text-embedding-3-small', -- Model for the runtime to use (default: 'text-embedding-3-small')
  base_url => 'https://api.openai.com', -- Server URL for a custom LLM deployment (default: OpenAI API URL)
  api_token => '<llm_api_token>', -- API token for the LLM server (default: inferred from the lantern_extras.llm_token GUC)
  azure_entra_token => '', -- For Azure deployments, you can authenticate with an Entra token instead
  dimensions => 1536, -- For newer-generation OpenAI models, you can choose the dimensionality of the returned embeddings (default: 1536)
  input_type => 'search_query', -- Only needed for the cohere runtime, to indicate whether the input is a query ('search_query') or a document ('search_document') (default: 'search_query')
  runtime => 'openai' -- Runtime to use (default: 'openai'). Use `SELECT get_available_runtimes()` for the full list
);
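A quick way to sanity-check a call is to measure the returned vector; a minimal sketch, assuming llm_embedding returns a Postgres real[] array:
SELECT array_length(llm_embedding(input => 'hello world', model => 'text-embedding-3-small'), 1);
-- Expect 1536 with the defaults above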
For bulk embedding generation, use add_embedding_job. This ensures the embeddings are generated in parallel.
SELECT add_embedding_job(
  table => 'articles', -- Name of the table
  src_column => 'content', -- Source column for embeddings
  dst_column => 'content_embedding', -- Destination column for embeddings (will be created automatically)
  model => 'text-embedding-3-small', -- Model for the runtime to use (default: 'text-embedding-3-small')
  pk => 'id', -- Primary key of the table; the table is required to have one (default: 'id')
  schema => 'public', -- Schema the table is located in (default: 'public')
  base_url => 'https://api.openai.com', -- Server URL for a custom LLM deployment (default: OpenAI API URL)
  batch_size => 500, -- Input batch size per request to the LLM server; choose based on your API tier (default: determined by model and runtime)
  dimensions => 1536, -- For newer-generation OpenAI models, you can choose the dimensionality of the returned embeddings (default: 1536)
  api_token => '<llm_api_token>', -- API token for the LLM server (default: inferred from the lantern_extras.llm_token GUC)
  azure_entra_token => '', -- For Azure deployments, you can authenticate with an Entra token instead
  runtime => 'openai' -- Runtime to use (default: 'openai'). Use `SELECT get_available_runtimes()` for the full list
);
To monitor your embedding jobs, you can query:
SELECT * FROM get_embedding_jobs();
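With embeddings in place, you will usually want a vector index so similarity queries stay fast at scale. A sketch using Lantern's HNSW index (the dist_l2sq_ops operator class matches the <-> distance operator used in the query below; swap in dist_cos_ops for cosine distance):
CREATE INDEX ON articles USING lantern_hnsw (content_embedding dist_l2sq_ops);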
Pair LLM completion with embedding generation for RAG applications
Combining embedding generation and LLM completions allows you to perform the entire RAG workflow directly within Postgres. For example, to find the most relevant user bios and generate a coherent answer:
WITH contexts AS (
SELECT
name,
experience
FROM
company_employees
ORDER BY
vector <-> llm_embedding('openai/text-embedding-3-small', 'LLMs')
LIMIT 10
)
SELECT
llm_completion(
STRING_AGG(name || ' (' || experience || ')', ', '),
'Who should I go to for advice on LLMs? Here are some options.'
)
FROM contexts;
This query retrieves the 10 most relevant users and uses the LLM to generate a consolidated answer.
Try it out for free in Lantern Cloud
LLM completions and embedding generation are both available in Lantern Cloud. Getting started is easy.
- Create a free account on Lantern Cloud.
- Connect to your database and enable LLM completions:
ALTER SYSTEM SET lantern_extras.enable_daemon=true;
SELECT pg_reload_conf();
ALTER DATABASE postgres SET lantern_extras.llm_token='YOUR OPENAI API KEY';
- Start using the LLM and embedding functions; a quick smoke test follows below.
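To verify everything is wired up, run a quick check from a fresh connection (so the llm_token setting takes effect); any short prompt works:
SELECT llm_completion(user_prompt => 'Say hello');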
We'd love to hear what you think! Feel free to send any feedback to support@lantern.dev.