Generate Embeddings
Lantern supports generating text and image embeddings inside the database. Try it out on Lantern Cloud.
Generating embeddings is compute-intensive. For large-scale embedding generation, such as generating embeddings over all of your data, Lantern provides a separate process.
OpenAI Text Embeddings
Before using OpenAI text embeddings, you need an OpenAI API key. You can get one by signing up at OpenAI. Once you have an API key, set it as a parameter in Postgres.
ALTER ROLE [YOUR_USERNAME] SET lantern_extras.openai_token='[YOUR_API_KEY]';
SELECT pg_reload_conf();
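Settings made with ALTER ROLE ... SET are applied when the role starts a new session, so it can help to reconnect and confirm that the token is visible before calling the embedding functions. A minimal check, using the standard Postgres current_setting function with its missing_ok argument set to true so that an unset parameter yields NULL instead of an error:

```sql
-- Run in a new session for the role configured above.
-- Reports whether the parameter has a non-empty value without printing the key itself.
SELECT coalesce(current_setting('lantern_extras.openai_token', true), '') <> ''
       AS openai_token_is_set;
```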
Use the openai_embedding function to generate text embeddings using the OpenAI embedding models. This function accepts a model name and text input as arguments and, for the text-embedding-3-small and text-embedding-3-large models, an optional dimension argument.
SELECT openai_embedding('openai/text-embedding-ada-002', 'My text input');
SELECT openai_embedding('openai/text-embedding-3-large', 'My text input');
SELECT openai_embedding('openai/text-embedding-3-large', 'My text input', 256);
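The calls above embed a literal string. As a sketch of how the same function might be applied to the rows of a table, the example below uses a hypothetical documents table (the table and column names are illustrative and not part of Lantern). CREATE TABLE ... AS lets Postgres infer the type of the embedding column from the function's return type, so no column type has to be assumed. Each row is embedded by a separate function call, which makes this approach best suited to small tables.

```sql
-- Hypothetical source table; the names are illustrative only.
CREATE TABLE documents (
    id   integer PRIMARY KEY,
    body text NOT NULL
);

INSERT INTO documents (id, body) VALUES
    (1, 'Lantern generates embeddings inside Postgres.'),
    (2, 'Embeddings map text to vectors.');

-- Embed every row, requesting 256 dimensions from text-embedding-3-large.
CREATE TABLE document_embeddings AS
SELECT id,
       openai_embedding('openai/text-embedding-3-large', body, 256) AS embedding
FROM documents;
```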
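The following embedding models are supported:
| Model Name | Dimensions | Max Tokens |
|---|---|---|
| openai/text-embedding-ada-002 | 1536 | 8192 |
| openai/text-embedding-3-small | 512 - 1536 | 8192 |
| openai/text-embedding-3-large | 256 - 3072 | 8192 |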
Cohere Text Embeddings
Before using Cohere text embeddings, you need to have a Cohere API key. You can get one by signing up at Cohere. Once you have an API key, set it as a parameter in Postgres.
ALTER ROLE [YOUR_USERNAME] SET lantern_extras.cohere_token='[YOUR_API_KEY]';
SELECT pg_reload_conf();
Use the cohere_embedding function to generate text embeddings using the Cohere embedding models. This function accepts a model name and text input as arguments, and an optional input type argument with values search_document or search_query (default is search_query).
SELECT cohere_embedding('cohere/embed-english-v3.0', 'My text input');
SELECT cohere_embedding('cohere/embed-english-v3.0', 'My text input', 'search_document');
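The input type distinguishes text that is stored and searched over (search_document) from text used as a search query (search_query). A minimal sketch of using both, assuming a hypothetical articles table whose name and columns are illustrative only:

```sql
-- Hypothetical table of stored text; the names are illustrative only.
CREATE TABLE articles (
    id    integer PRIMARY KEY,
    title text NOT NULL
);

INSERT INTO articles (id, title) VALUES
    (1, 'Getting started with vector search'),
    (2, 'Tuning index parameters');

-- Stored text is embedded as search_document.
SELECT id,
       cohere_embedding('cohere/embed-english-v3.0', title, 'search_document') AS embedding
FROM articles;

-- The user's query is embedded as search_query (also the default).
SELECT cohere_embedding('cohere/embed-english-v3.0',
                        'how do I tune a vector index?',
                        'search_query') AS query_embedding;
```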
The following embedding models are supported:
| Model Name | Dimensions | Max Tokens |
|---|---|---|
| | 1024 | 512 |
| | 1024 | 512 |
| | 4096 | 512 |
| | 1024 | 512 |
| | 768 | 512 |
| | 384 | 512 |
| | 384 | 512 |
Open-Source Text Embeddings
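Use the text_embedding function to generate text embeddings using open-source models. This function accepts a model name and text input as arguments.
For example, to generate an embedding for the text My text input using the open-source embedding model BAAI/bge-small-en in SQL, run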
SELECT text_embedding('BAAI/bge-small-en', 'My text input');
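Because text_embedding is an ordinary SQL function, it can also be called once per row of a query. The sketch below embeds several inputs in a single statement using an inline VALUES list; the sample sentences are illustrative.

```sql
-- Embed several inputs in one statement; each row gets its own embedding.
SELECT t.body,
       text_embedding('BAAI/bge-small-en', t.body) AS embedding
FROM (VALUES
        ('My text input'),
        ('Another text input')
     ) AS t(body);
```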
The following embedding models are supported:
| Model Name | Dimensions | Max Tokens |
|---|---|---|
| | 512 | 77 |
| | 768 | 128 |
| | 384 | 128 |
| | 768 | 250 |
| | 768 | 128 |
| | 1024 | 128 |
| | 1024 | 512 |
| | 768 | 512 |
| | 1024 | 512 |
| | 384 | 512 |
| | 768 | 512 |
| | 1024 | 512 |
| | 1024 | 8192 |
| | 512 | 8192 |
| | 768 | 8192 |
Image Embeddings
To generate image embeddings, use the image_embedding function. This function accepts a model name and image URL as arguments.
For example, to generate an embedding for the image https://lantern.dev/images/home/footer.png using the embedding model clip/ViT-B-32-visual, run
SELECT image_embedding('clip/ViT-B-32-visual', 'https://lantern.dev/images/home/footer.png');
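The same call can be applied to image URLs stored in a table. The following is a minimal sketch, assuming a hypothetical images table (its name and columns are illustrative) that holds one URL per row:

```sql
-- Hypothetical table of image URLs; the names are illustrative only.
CREATE TABLE images (
    id  integer PRIMARY KEY,
    url text NOT NULL
);

INSERT INTO images (id, url) VALUES
    (1, 'https://lantern.dev/images/home/footer.png');

-- Compute one embedding per stored URL.
SELECT id,
       image_embedding('clip/ViT-B-32-visual', url) AS embedding
FROM images;
```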
The following embedding models are supported:
| Model Name | Dimensions | Max Tokens |
|---|---|---|
| | 512 | 224 |
Self-Hosting
If you are self-hosting Lantern, generating embeddings requires the Lantern Extras extension. Installation steps are found here.
Once the extension is installed, the above functions are available.
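As a sketch of enabling and verifying the extension in a database, the statements below assume it is registered under the name lantern_extras, matching the prefix of the settings used earlier on this page; if your installation uses a different name, substitute it. pg_extension is the standard Postgres catalog of installed extensions.

```sql
-- Assumption: the extension name is lantern_extras (inferred from the setting prefix).
CREATE EXTENSION IF NOT EXISTS lantern_extras;

-- Confirm that the extension is installed and check its version.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'lantern_extras';
```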