🧬 Embeddings
Embeddings can be used to create a numerical representation of textual data. This numerical representation is useful because it can be used to find similar documents.
An embedding is a vector (list) of floating-point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.
They are commonly used for:
· Search (where results are ranked by relevance to a query string)
· Clustering (where text strings are grouped by similarity)
· Recommendations (where items with related text strings are recommended)
· Anomaly detection (where outliers with little relatedness are identified)
· Diversity measurement (where similarity distributions are analyzed)
· Classification (where text strings are classified by their most similar label)
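The distance measure behind all of these uses is typically cosine similarity. A minimal sketch with toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions; the values here are made up):

```shell
# Cosine similarity of two toy "embeddings" using awk.
# A result near 1.0 means the vectors point in nearly the same direction,
# i.e. the texts they represent are highly related.
awk 'BEGIN {
  split("0.1 0.9 0.2", a); split("0.2 0.8 0.1", b)
  for (i = 1; i <= 3; i++) { dot += a[i]*b[i]; na += a[i]*a[i]; nb += b[i]*b[i] }
  printf "%.4f\n", dot / (sqrt(na) * sqrt(nb))
}'
```

For these two vectors the script prints 0.9866 — close to 1, so the vectors are highly related.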
AWS Bedrock embedding models can be used to generate embeddings for a given text.
Prerequisite
1. Create your Azure OpenAI resource and wait for approval (approximately 10 business days)
2. Your API key will be available at Azure OpenAI > click name_azure_openai > click Click here to manage keys
Setup
Azure OpenAI Embeddings
1. Click Go to Azure OpenAI Studio
2. Click Deployments
3. Click Create new deployment
4. Configure the deployment settings and click Create
5. Successfully created Azure OpenAI Embeddings
· Deployment name: text-embedding-ada-002
· Instance name: top right corner
THub
1. Embeddings > drag Azure OpenAI Embeddings node
2. Connect Credential > click Create New
3. Copy and paste each detail (API Key, Instance & Deployment name, API Version) into the Azure OpenAI Embeddings credential
4. Voila 🎉, you have created an Azure OpenAI Embeddings node in THub
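The same credential details map onto a plain REST call, which is handy for sanity-checking them outside THub. The instance and deployment values below are placeholders standing in for the ones from the steps above, and the API version shown is an assumption — use whichever version your resource supports:

```shell
# Build the Azure OpenAI embeddings endpoint from the credential details.
# INSTANCE and DEPLOYMENT are placeholders — substitute your own values.
INSTANCE="name_azure_openai"
DEPLOYMENT="text-embedding-ada-002"
API_VERSION="2023-05-15"   # assumption: pick the version your resource supports
URL="https://${INSTANCE}.openai.azure.com/openai/deployments/${DEPLOYMENT}/embeddings?api-version=${API_VERSION}"
echo "$URL"
# With your key exported as AZURE_OPENAI_API_KEY, the deployment can be tested:
# curl "$URL" -H "Content-Type: application/json" \
#      -H "api-key: $AZURE_OPENAI_API_KEY" -d '{"input": "Test"}'
```

If the echoed URL answers the commented curl call with an embedding vector, the credential values are correct.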
Cohere API to generate embeddings for a given text.
Google Generative API to generate embeddings for a given text.
Google MakerSuite PaLM API to generate embeddings for a given text.
Google VertexAI API to generate embeddings for a given text.
HuggingFace Inference API to generate embeddings for a given text.
LocalAI Setup
LocalAI is a drop-in replacement REST API compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and more) locally or on-prem on consumer-grade hardware, supporting multiple model families that are compatible with the ggml format.
To use LocalAI Embeddings within THub, follow the steps below:
1. git clone https://github.com/go-skynet/LocalAI
2. cd LocalAI
3. LocalAI provides an API endpoint to download/install models. In this example, we are going to use the BERT Embeddings model:
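The install request can be sketched as below. The /models/apply endpoint comes from the LocalAI docs, but the model id bert-embeddings is illustrative — check your LocalAI model gallery for the exact id:

```shell
# JSON payload for LocalAI's model-install endpoint.
# "bert-embeddings" is an illustrative gallery id, not a guaranteed one.
PAYLOAD='{"id": "bert-embeddings"}'
echo "$PAYLOAD"
# With LocalAI running on its default port:
# curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d "$PAYLOAD"
```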
4. In the /models folder, you should be able to see the downloaded model:
5. You can now test the embeddings:
curl http://localhost:8080/v1/embeddings -H "Content-Type: application/json" -d '{
"input": "Test",
"model": "text-embedding-ada-002"
}'
6. Response should look like:
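An OpenAI-compatible embeddings response generally has this shape (vector heavily truncated, numbers illustrative):

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0075, -0.0213, 0.0162]
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": { "prompt_tokens": 1, "total_tokens": 1 }
}
```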
Setup
Drag and drop a new LocalAIEmbeddings component to the canvas:
Fill in the fields:
· Base Path: The base url from LocalAI such as http://localhost:8080/v1
· Model Name: The model you want to use. Note that it must be inside the /models folder of the LocalAI directory. For instance: text-embedding-ada-002
That's it! For more information, refer to the LocalAI docs.
MistralAI API to generate embeddings for a given text.
Generate embeddings for a given text using an open-source model on Ollama.
OpenAI API to generate embeddings for a given text.
OpenAI API to generate embeddings for a given text.
TogetherAI Embedding models to generate embeddings for a given text.
Voyage AI API to generate embeddings for a given text.