💾 Memory

Memory allows you to chat with the AI as if it remembers previous conversations.

Human: hi I am bob

AI: Hello Bob! It's nice to meet you. How can I assist you today?

Human: what's my name?

AI: Your name is Bob, as you mentioned earlier.

Under the hood, these conversations are stored in arrays or databases and provided as context to the LLM. For example:

```
You are an assistant to a human, powered by a large language model trained by OpenAI.

Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.

Current conversation:
{history}
```
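To make this concrete, here is a minimal sketch (illustrative only, not THub's actual internals) of how stored turns could be flattened into the {history} placeholder before the prompt is sent to the LLM:

```python
# Sketch: render stored conversation turns into the {history} placeholder.
# The message list and storage shape are illustrative assumptions.
PROMPT_TEMPLATE = (
    "You are an assistant to a human, powered by a large language model "
    "trained by OpenAI.\n\n"
    "Current conversation:\n{history}\n"
)

# Conversation turns as they might be stored in an array or database table
messages = [
    {"role": "Human", "content": "hi I am bob"},
    {"role": "AI", "content": "Hello Bob! It's nice to meet you."},
    {"role": "Human", "content": "what's my name?"},
]

# Flatten the stored turns into the text block the LLM receives as context
history = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
print(PROMPT_TEMPLATE.format(history=history))
```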

Separate conversations for multiple users

UI & Embedded Chat

By default, the UI and Embedded Chat will automatically separate different users' conversations. This is done by generating a unique chatId for each new interaction. That logic is handled under the hood by THub.

Prediction API

You can separate the conversations for multiple users by specifying a unique sessionId:

1. For every memory node, you should be able to see an input parameter Session ID.

2. In the /api/v1/prediction/{your-chatflowid} POST body request, specify the sessionId in overrideConfig:

```json
{
    "question": "hello!",
    "overrideConfig": {
        "sessionId": "user1"
    }
}
```
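For example, a minimal Python client might look like the sketch below; the host, port, and chatflow ID are placeholders for your own THub instance:

```python
import requests

# Placeholder URL; replace with your own THub host and chatflow ID
API_URL = "http://localhost:3000/api/v1/prediction/<your-chatflowid>"

def ask(question: str, session_id: str) -> dict:
    """Send a prediction request scoped to one user's session."""
    payload = {
        "question": question,
        "overrideConfig": {"sessionId": session_id},
    }
    response = requests.post(API_URL, json=payload)
    response.raise_for_status()
    return response.json()

# Each sessionId keeps its own conversation history
print(ask("hi I am bob", session_id="user1"))
print(ask("what's my name?", session_id="user1"))  # same session, remembers "Bob"
print(ask("what's my name?", session_id="user2"))  # separate session, no memory of Bob
```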

Message API

· GET /api/v1/chatmessage/{your-chatflowid}

· DELETE /api/v1/chatmessage/{your-chatflowid}
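As a rough sketch (base URL and response shape are assumptions, not guaranteed by this doc), these endpoints can be called like so:

```python
import requests

# Placeholder base URL and chatflow ID; adjust for your THub deployment
BASE = "http://localhost:3000/api/v1/chatmessage/<your-chatflowid>"

# Fetch the stored messages for this chatflow
for message in requests.get(BASE).json():
    print(message)

# Delete the stored messages, clearing the conversation history
requests.delete(BASE).raise_for_status()
```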

All conversations can also be visualized and managed from the UI.

For the OpenAI Assistant, Threads are used to store conversations.

1) Buffer Memory

Uses the THub database table chat_message as the storage mechanism for storing/retrieving conversations.

2) Buffer Window Memory

Uses the THub database table chat_message as the storage mechanism for storing/retrieving conversations.

The difference is that it only fetches the last K interactions. This approach is beneficial for preserving a sliding window of the most recent interactions, ensuring the buffer remains manageable in size.
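A conceptual sketch of that sliding window (not THub's actual code) might look like:

```python
# Keep only the last K interactions; each interaction is one human turn
# plus one AI turn, i.e. two stored messages.
K = 3

def window(messages: list[dict], k: int = K) -> list[dict]:
    """Return the most recent k interactions from the full history."""
    return messages[-2 * k:]

history = [{"role": "Human", "content": f"msg {i}"} for i in range(10)]
print(window(history))  # only the last 6 messages are sent to the LLM
```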

3) Conversation Summary Memory

Uses the THub database table chat_message as the storage mechanism for storing/retrieving conversations.

This memory type creates a brief summary of the conversation over time, updating and saving the running summary as the conversation goes on. This is especially helpful in longer chats, where saving every past message would take up too much space.
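Conceptually, the summary is refreshed by the LLM itself after each exchange. A hedged sketch, where llm is any text-completion callable you supply:

```python
def update_summary(llm, current_summary: str, new_turns: str) -> str:
    """Ask the LLM to fold the newest turns into the running summary."""
    prompt = (
        "Progressively summarize the conversation below.\n\n"
        f"Current summary:\n{current_summary}\n\n"
        f"New lines of conversation:\n{new_turns}\n\n"
        "New summary:"
    )
    return llm(prompt)
```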

4) Conversation Summary Buffer Memory

Uses the THub database table chat_message as the storage mechanism for storing/retrieving conversations.

This memory keeps a buffer of recent interactions and compiles old ones into a summary, using both in its storage. Instead of flushing old interactions based solely on their number, it uses the total token length of the buffer to decide when to clear them out.
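The pruning rule can be pictured like this (a sketch under assumed llm and count_tokens helpers, not THub's implementation):

```python
def prune(buffer: list[str], summary: str, llm, count_tokens, max_tokens: int = 2000):
    """Fold the oldest turns into the summary once the buffer exceeds max_tokens."""
    while buffer and sum(count_tokens(m) for m in buffer) > max_tokens:
        oldest = buffer.pop(0)  # flush by token budget, not by message count
        summary = llm(
            f"Current summary:\n{summary}\n\n"
            f"Fold in this older exchange:\n{oldest}\n\nNew summary:"
        )
    return buffer, summary
```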

5) DynamoDB Chat Memory

Stores the conversation in a DynamoDB table.

6) MongoDB Atlas Chat Memory

Stores the conversation in MongoDB Atlas.

7) Redis-Backed Chat Memory

Summarizes the conversation and stores the memory in a Redis server.

8) Upstash Redis-Backed Chat Memory

Summarizes the conversation and stores the memory in an Upstash Redis server.

9) Zep Memory

Zep is a long-term memory store for LLM applications. It stores, summarizes, embeds, indexes, and enriches LLM app / chatbot histories, and exposes them via simple, low-latency APIs.

Guide to Deploy Zep to Render

You can easily deploy Zep to cloud services like Render or Fly.io. If you prefer to test it locally, you can also spin up a Docker container by following their quick start guide.

In this example, we are going to deploy to Render.

1. Head over to the Zep Repo and click Deploy to Render.

2. This will bring you to Render's Blueprint page; simply click Create New Resources.

3. When the deployment is done, you should see 3 applications created on your dashboard.

4. Simply click the first one, called zep, and copy the deployed URL.

Guide to Deploy Zep to Digital Ocean (via Docker)

Use in THub UI

1. Back in the THub application, simply create a new canvas or use one of the templates from the marketplace. In this example, we are going to use the Simple Conversational Chain.

2. Replace Buffer Memory with Zep Memory. Then replace the Base URL with the Zep URL you copied above.

3. Save the chatflow and test it out to see if conversations are remembered.

4. Now try clearing the chat history; you should see that it is no longer able to remember the previous conversations.

Zep Authentication

Zep allows you to secure your instance using JWT authentication. We'll be using the zepcli command-line utility here.

1. Generate a secret and the JWT token

After downloading the ZepCLI:

You will first get your SECRET Token:

Then you will get the JWT token:

2. Configure Auth environment variables

Set the following environment variables in your Zep server environment:
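At the time of writing, Zep's server reads its auth settings from variables along these lines; the exact names may vary by Zep version, so treat this as an assumption and check the Zep docs:

```
ZEP_AUTH_REQUIRED=true
ZEP_AUTH_SECRET=<the secret generated in step 1>
```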

3. Configure Credential on THub

Add a new credential for Zep, and put the JWT token in the API Key field:

4. Use the created credential on the Zep node

In the Zep node's Connect Credential, select the credential you just created. And that's it!

Threads

Threads are only used when an OpenAI Assistant is being used. A thread is a conversation session between an Assistant and a user. Threads store messages and automatically handle truncation to fit content into a model's context.

Separate conversations for multiple users

UI & Embedded Chat

By default, the UI and Embedded Chat will automatically separate threads for different users' conversations. This is done by generating a unique chatId for each new interaction. That logic is handled under the hood by THub.

Prediction API

In the POST /api/v1/prediction/{your-chatflowid} body, specify the chatId. The same thread will be used for the same chatId:
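For example (the chatId value here is illustrative):

```json
{
    "question": "hello!",
    "chatId": "user1"
}
```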

Message API

· GET /api/v1/chatmessage/{your-chatflowid}

· DELETE /api/v1/chatmessage/{your-chatflowid}

You can also filter via chatId - /api/v1/chatmessage/{your-chatflowid}?chatId={your-chatid}
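For instance, with Python's requests (placeholder values):

```python
import requests

# Fetch only the messages belonging to one thread's chatId
url = "http://localhost:3000/api/v1/chatmessage/<your-chatflowid>"
messages = requests.get(url, params={"chatId": "user1"}).json()
```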

All conversations can also be visualized and managed from the UI.
