Enhancing GPT Conversations with Custom Context Management Techniques


Introduction

As AI-powered conversational agents become more popular, it is essential to address a key limitation of models like GPT: they can only attend to a fixed number of tokens, so context gets lost in long conversations. In this post, we explore four context management techniques (conversation summarization, selective context, context window, and adaptive context) and show how to implement them in Python using OpenAI's GPT and Pinecone. We also discuss the pros and cons of each technique, what the implementation is missing, and some additional considerations.

Note: Code at the bottom of the post.

Context Management Techniques

Conversation Summarization: Generate a summary of the entire conversation history to provide a condensed context for the model.
Selective Context: Retrieve the most relevant messages from the conversation history using a vector index (powered by Pinecone).
Context Window: Use only the most recent messages in the conversation as context, ignoring earlier messages.
Adaptive Context: Automatically switch between summarization and selective context based on the conversation length.
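
As a rough illustration of the context window and the length-based switch described above, here is a minimal, dependency-free sketch. The function names, the strategy labels, and the four-characters-per-token estimate are assumptions made for this example; they are not part of any library or of the class presented later.

```python
def estimate_tokens(text):
    """Very rough token estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4) if text else 0


def build_context_window(history, window_size=5):
    """Join only the most recent `window_size` messages."""
    if window_size <= 0:
        # history[-0:] would return the whole list, so handle this case explicitly.
        return ""
    return "\n".join(history[-window_size:])


def choose_strategy(history, token_limit=4096):
    """Return 'selective' for long histories and 'summary' otherwise."""
    total = sum(estimate_tokens(message) for message in history)
    return "selective" if total > token_limit else "summary"


history = [f"User: message number {i}" for i in range(10)]
print(build_context_window(history, window_size=2))
print(choose_strategy(history, token_limit=20))
```

The token estimate is deliberately crude. A real tokenizer would give exact counts, but the switching logic stays the same.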

Implementation

The GPTConversationManager class (see the Code section at the end of this post) shows one way to implement these techniques in Python. It uses OpenAI's GPT to generate summaries and responses, a SentenceTransformer model to embed messages, and Pinecone to index those embeddings. The code is written against the legacy openai (pre-1.0) and pinecone-client 2.x packages.

The class has separate methods for each context management technique:

ask_gpt_with_summary: Generates a summary of the conversation history and uses it as context.
ask_gpt_with_selective_context: Retrieves the most relevant messages from the conversation history using Pinecone's vector index and uses them as context.
ask_gpt_with_context_window: Takes a specified number of recent messages as context.
ask_gpt_with_adaptive_context: Automatically switches between summarization and selective context based on the conversation length.
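
To make the retrieval step behind selective context more concrete, the sketch below ranks stored messages by cosine similarity to a query vector. It is only an illustration: an in-memory list stands in for the Pinecone index, the two-dimensional vectors are toy values rather than real embeddings, and the names cosine_similarity and top_k_messages are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_messages(query_vector, stored, k=2):
    """Return the k messages whose vectors are most similar to the query.

    `stored` is a list of (message, vector) pairs standing in for a vector index.
    """
    scored = [(cosine_similarity(query_vector, vec), msg) for msg, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [msg for _, msg in scored[:k]]


# Toy 2-D vectors; real embeddings would come from a sentence-embedding model.
stored = [
    ("User: I need a website layout.", [1.0, 0.0]),
    ("GPT: Use a clear navigation bar.", [0.9, 0.1]),
    ("User: What about lunch?", [0.0, 1.0]),
]
print(top_k_messages([1.0, 0.05], stored, k=2))
```

A hosted vector index does the same ranking at scale, typically with approximate nearest-neighbor search instead of scoring every stored vector.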

Pros and Cons

Conversation Summarization:
- Pros: Provides a condensed overview of the conversation, reducing the risk of exceeding token limits.
- Cons: May lose important details or nuances from the original conversation.
Selective Context:
- Pros: Retrieves the most relevant messages from the conversation history, making the context more targeted and efficient.
- Cons: Relies on the accuracy of the vector index and may not always retrieve the most relevant messages.
Context Window:
- Pros: Simple to implement and ensures the model receives the most recent messages.
- Cons: May lose important context from earlier in the conversation.
Adaptive Context:
- Pros: Picks summarization or selective context automatically based on conversation length, so the caller does not have to choose.
- Cons: Inherits the weaknesses of whichever technique it selects, and the length threshold itself needs tuning.

Missing Aspects and Further Improvements

Error handling: Implement error handling to manage unexpected situations.
User customization: Allow users to customize GPT engine settings.
Context relevance threshold: Set a relevance threshold for the selective context technique.
Conversation history persistence: Store conversation history in a file or database for persistence.
Multi-user support: Manage separate conversation histories for multiple users.
Handling out-of-context responses: Implement strategies to detect and handle irrelevant responses.
Adaptive context management: Go beyond the length-based switch and develop a method that automatically chooses the best context management technique for each prompt.
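
Two of these items, the relevance threshold and per-user persistence, can be sketched without any external service. The example below is a minimal illustration under stated assumptions: filter_by_threshold and ConversationStore are hypothetical names, the 0.75 cutoff is arbitrary, and a JSON file stands in for whatever database a real deployment would use.

```python
import json
import os
import tempfile
from collections import defaultdict


def filter_by_threshold(matches, min_score=0.75):
    """Keep only messages whose similarity score meets the threshold.

    `matches` is a list of (message, score) pairs; 0.75 is an arbitrary default.
    """
    return [message for message, score in matches if score >= min_score]


class ConversationStore:
    """Keeps one message list per user and saves them to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.histories = defaultdict(list)
        # Load any histories saved by a previous session.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as handle:
                self.histories.update(json.load(handle))

    def add_message(self, user_id, message):
        self.histories[user_id].append(message)

    def save(self):
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.histories, handle, indent=2)


# Usage: two users, saved and reloaded from a temporary file.
path = os.path.join(tempfile.mkdtemp(), "histories.json")
store = ConversationStore(path)
store.add_message("alice", "User: I need some ideas for a website layout.")
store.add_message("bob", "User: Suggest a color scheme.")
store.save()

reloaded = ConversationStore(path)
print(reloaded.histories["alice"])
print(filter_by_threshold([("relevant", 0.9), ("off-topic", 0.2)]))
```

Keying histories by user ID keeps conversations separate, and the threshold stops weakly related messages from being added to the context just because they happen to rank in the top results.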

Conclusion

By implementing various context management techniques, we can address the limitations of GPT models and improve the quality of AI-generated responses in long conversations. While each technique has its pros and cons, using a combination of techniques or the adaptive context management approach can result in a more robust and effective conversation manager. Further improvements can be made by addressing missing aspects and incorporating user feedback to enhance the overall user experience.

Code

import openai
import pinecone
from sentence_transformers import SentenceTransformer

class GPTConversationManager:
    def __init__(self, api_key, pinecone_api_key, index_name):
        self.api_key = api_key
        openai.api_key = self.api_key
        self.conversation_history = []
        self.pinecone_api_key = pinecone_api_key
        self.index_name = index_name
        pinecone.init(api_key=self.pinecone_api_key)
        self.embedder = SentenceTransformer("sentence-transformers/paraphrase-MiniLM-L6-v2")

    def add_message(self, message):
        self.conversation_history.append(message)
        embedding = self.embedder.encode(message).tolist()
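        # Store the embedding under the message's position in the history.
        # Assumes pinecone-client 2.x, where upsert is a method of pinecone.Index.
        index = pinecone.Index(self.index_name)
        index.upsert(vectors=[(str(len(self.conversation_history) - 1), embedding)])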

    def get_summary(self, conversation):
        summary_prompt = f"Summarize the following conversation: {conversation}"
        summary_response = openai.Completion.create(
            engine="text-davinci-002",
            prompt=summary_prompt,
            max_tokens=50,
            n=1,
            stop=None,
            temperature=0.5,
        )
        return summary_response.choices[0].text.strip()

    def retrieve_relevant_messages(self, query, num_messages=5):
        query_embedding = self.embedder.encode(query).tolist()
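        # Assumes pinecone-client 2.x: query the index for the closest stored vectors.
        index = pinecone.Index(self.index_name)
        result = index.query(vector=query_embedding, top_k=num_messages)
        # Skip IDs left over from earlier sessions that are not in this history.
        relevant_messages = [
            self.conversation_history[int(match["id"])]
            for match in result["matches"]
            if int(match["id"]) < len(self.conversation_history)
        ]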
        return relevant_messages

    def generate_response(self, prompt, context=None):
        if context:
            full_prompt = f"{context}\n{prompt}"
        else:
            full_prompt = prompt

        response = openai.Completion.create(
            engine="text-davinci-002",
            prompt=full_prompt,
            max_tokens=100,
            n=1,
            stop=None,
            temperature=0.5,
        )
        return response.choices[0].text.strip()

    def ask_gpt_with_summary(self, prompt):
        conversation = ' '.join(self.conversation_history)
        summary = self.get_summary(conversation)
        response = self.generate_response(prompt, context=summary)
        self.add_message(response)
        return response

    def ask_gpt_with_selective_context(self, prompt):
        relevant_messages = self.retrieve_relevant_messages(prompt)
        context = ' '.join(relevant_messages)
        response = self.generate_response(prompt, context=context)
        self.add_message(response)
        return response

    def ask_gpt_with_adaptive_context(self, prompt, token_limit=4096):
        conversation = ' '.join(self.conversation_history)
        # The openai package has no token-counting helper, so estimate roughly
        # four characters per token; a tokenizer such as tiktoken would be more exact.
        tokens_in_conversation = len(conversation) // 4

        if tokens_in_conversation > token_limit:
            # If the conversation is too long, use selective context
            return self.ask_gpt_with_selective_context(prompt)
        else:
            # Otherwise, use conversation summarization
            return self.ask_gpt_with_summary(prompt)

    def ask_gpt_with_context_window(self, prompt, window_size=5):
        context_window = ' '.join(self.conversation_history[-window_size:])
        response = self.generate_response(prompt, context=context_window)
        self.add_message(response)
        return response

    # No __del__ is defined: the Pinecone client has no deinit() call
    # and needs no explicit teardown.

# Usage example
gpt = GPTConversationManager("your_api_key", "your_pinecone_api_key", "conversation-embeddings")

# Add initial conversation history
gpt.add_message("User: I need some ideas for a website layout.")
gpt.add_message("GPT: I suggest having a clean and minimalistic layout with a clear navigation bar, large hero image, and clear call-to-action buttons.")

# Ask a follow-up question using different methods
response1 = gpt.ask_gpt_with_summary("User: What are some ideas for the navigation bar?")
response2 = gpt.ask_gpt_with_selective_context("User: Can you suggest some color schemes for the website?")
response3 = gpt.ask_gpt_with_context_window("User: What should I consider when choosing typography?")

print("Response using summarization:", response1)
print("Response using selective context:", response2)
print("Response using context window:", response3)

# Loop Example
try:
    print("Starting the conversation. Press Ctrl+C to stop.")
    while True:
        user_input = input("User: ")
        gpt.add_message(f"User: {user_input}")
        
        print("Select context management method:")
        print("1. Summarization")
        print("2. Selective context")
        print("3. Context window")
        # Compare strings so that non-numeric input falls through to the default
        # instead of raising ValueError.
        choice = input("Enter the number of your choice: ").strip()

        if choice == "1":
            response = gpt.ask_gpt_with_summary(user_input)
        elif choice == "2":
            response = gpt.ask_gpt_with_selective_context(user_input)
        elif choice == "3":
            response = gpt.ask_gpt_with_context_window(user_input)
        else:
            print("Invalid choice. Using summarization as default.")
            response = gpt.ask_gpt_with_summary(user_input)

        print("GPT:", response)

except KeyboardInterrupt:
    print("\nEnding the conversation.")