🧬Embeddings

Embeddings create a numerical representation of textual data. This representation is useful because it can be compared mathematically, for example to find similar documents.

An embedding is a vector (list) of floating-point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.
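To make the idea of distance concrete, here is a minimal sketch using cosine distance, which is one common choice of metric (an assumption here; the text above does not name a specific metric). The three-dimensional vectors are hypothetical toy values, not output from any real embedding model.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical toy embeddings; real models return far more dimensions.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Related texts should sit closer together than unrelated ones.
print(cosine_distance(cat, kitten) < cosine_distance(cat, invoice))
```

With these values, the "cat" and "kitten" vectors are much closer to each other than "cat" and "invoice", which is the behaviour described above.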

They are commonly used for:

· Search (where results are ranked by relevance to a query string)

· Clustering (where text strings are grouped by similarity)

· Recommendations (where items with related text strings are recommended)

· Anomaly detection (where outliers with little relatedness are identified)

· Diversity measurement (where similarity distributions are analyzed)

· Classification (where text strings are classified by their most similar label)
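Two of the uses listed above, search and classification, can be sketched with a few lines of Python. This is an illustration under assumptions: the vectors are hypothetical toy values, cosine similarity is an assumed metric, and the helper names `rank_by_similarity` and `classify` are made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, items):
    """Search: return item names ordered from most to least similar."""
    return sorted(
        items,
        key=lambda name: cosine_similarity(query_vec, items[name]),
        reverse=True,
    )


def classify(text_vec, label_vecs):
    """Classification: return the label whose vector is most similar."""
    return max(
        label_vecs,
        key=lambda label: cosine_similarity(text_vec, label_vecs[label]),
    )


# Hypothetical toy embeddings standing in for real model output.
documents = {
    "refund policy": [0.9, 0.1, 0.1],
    "shipping times": [0.2, 0.9, 0.1],
    "office party": [0.1, 0.1, 0.9],
}
labels = {"billing": [1.0, 0.0, 0.0], "logistics": [0.0, 1.0, 0.0]}
query = [0.8, 0.3, 0.0]  # imagined embedding of a refund question

ranking = rank_by_similarity(query, documents)
predicted_label = classify(query, labels)
print(ranking)
print(predicted_label)
```

In practice a vector store performs the ranking step; the sketch only shows the comparison it is built on.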

1)AWS Bedrock Embeddings

AWS Bedrock embedding models to generate embeddings for a given text.

2)Azure OpenAI Embeddings

Prerequisite

1. Log in or sign up to Azure

2. Create your Azure OpenAI resource and wait for approval (approximately 10 business days)

3. Your API key will be available under Azure OpenAI > name_azure_openai > Click here to manage keys

Setup

Azure OpenAI Embeddings

1. Click Go to Azure OpenAI Studio

2. Click Deployments

3. Click Create new deployment

4. Select as shown below and click Create

5. Successfully created Azure OpenAI Embeddings

· Deployment name: text-embedding-ada-002

· Instance name: shown in the top right corner

THub

1. Embeddings > drag Azure OpenAI Embeddings node

2. Connect Credential > click Create New

3. Copy and paste each detail (API Key, Instance name, Deployment name, API Version) into the Azure OpenAI Embeddings credential

4. Voila 🎉, you have created an Azure OpenAI Embeddings node in THub
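To show how the four credential fields relate to each other, the sketch below assembles a request against the Azure OpenAI embeddings REST endpoint. It is an illustration only, not THub's internal code: the helper name `build_azure_embeddings_request`, the instance name `my-instance`, the key placeholder, and the API version `2023-05-15` are assumptions to be replaced with your own values. The request is built but not sent.

```python
import json
import urllib.request


def build_azure_embeddings_request(api_key, instance_name, deployment_name,
                                   api_version, text):
    """Build (but do not send) an embeddings request for Azure OpenAI."""
    # The instance name selects the host; the deployment name selects the model.
    url = (
        f"https://{instance_name}.openai.azure.com/openai/deployments/"
        f"{deployment_name}/embeddings?api-version={api_version}"
    )
    body = json.dumps({"input": text}).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values (assumptions); substitute the details from your credential.
request = build_azure_embeddings_request(
    api_key="YOUR_API_KEY",
    instance_name="my-instance",
    deployment_name="text-embedding-ada-002",
    api_version="2023-05-15",
    text="Test",
)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it; a 401 or 404 response usually points to a mismatched key, instance name, or deployment name.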

3)Cohere Embeddings

Cohere API to generate embeddings for a given text.

4)Google GenerativeAI Embeddings

Google Generative AI API to generate embeddings for a given text.

5)Google PaLM Embeddings

Google MakerSuite PaLM API to generate embeddings for a given text.

6)Google VertexAI Embeddings

Google VertexAI API to generate embeddings for a given text.

7)HuggingFace Inference Embeddings

HuggingFace Inference API to generate embeddings for a given text.

8)LocalAI Embeddings

LocalAI Setup

LocalAI is a drop-in replacement REST API that is compatible with the OpenAI API specifications for local inferencing. It allows you to run LLMs (and other models) locally or on-prem with consumer-grade hardware, supporting multiple model families that are compatible with the ggml format.

To use LocalAI Embeddings within THub, follow the steps below:

1. git clone https://github.com/go-skynet/LocalAI

2. cd LocalAI

3. LocalAI provides an API endpoint to download/install the model. In this example, we are going to use the BERT Embeddings model:

4. In the /models folder, you should be able to see the downloaded model:

5. You can now test the embeddings:

curl http://localhost:8080/v1/embeddings -H "Content-Type: application/json" -d '{
  "input": "Test",
  "model": "text-embedding-ada-002"
}'

6. Response should look like:
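For scripted checks, the same test request can be sent from Python. The sketch below is an illustration rather than part of LocalAI or THub: the helper names are hypothetical, and it assumes the response body follows the OpenAI embeddings shape (a `data` list whose first item carries an `embedding` array), which is an expectation based on LocalAI's OpenAI compatibility rather than something verified here.

```python
import json
import urllib.request


def build_embeddings_request(base_path, text, model):
    """Build a POST request equivalent to the curl test command."""
    url = base_path.rstrip("/") + "/embeddings"
    body = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_embedding(response_body):
    """Pull the first vector out of an OpenAI-style response (assumed shape)."""
    return response_body["data"][0]["embedding"]


def embed(base_path, text, model):
    """Send the request to a running LocalAI server and return the vector."""
    request = build_embeddings_request(base_path, text, model)
    with urllib.request.urlopen(request) as response:
        return extract_embedding(json.load(response))
```

With LocalAI running, `embed("http://localhost:8080/v1", "Test", "text-embedding-ada-002")` should return a list of floats if the assumed response shape holds.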

Setup

Drag and drop a new LocalAIEmbeddings component onto the canvas:

Fill in the fields:

· Base Path: The base URL of LocalAI, such as http://localhost:8080/v1

· Model Name: The model you want to use. Note that it must be inside the /models folder of the LocalAI directory. For instance: text-embedding-ada-002

That's it! For more information, refer to the LocalAI docs.

9)MistralAI Embeddings

MistralAI API to generate embeddings for a given text.

10)Ollama Embeddings

Generate embeddings for a given text using open-source models on Ollama.

11)OpenAI Embeddings

OpenAI API to generate embeddings for a given text.

12)OpenAI Embeddings Custom

OpenAI API to generate embeddings for a given text.

13)TogetherAI Embedding

TogetherAI Embedding models to generate embeddings for a given text.

14)VoyageAI Embeddings

Voyage AI API to generate embeddings for a given text.
