Spring AI Chat Memory — Build Stateful Multi-Turn Conversations
By default, every call to an LLM is stateless — the model has no memory of previous messages. To build a real chatbot where the AI remembers what was said earlier, you must maintain conversation history. Spring AI provides the MessageChatMemoryAdvisor to handle this automatically, storing and replaying chat history on every turn.
How Chat Memory Works
Turn 1: User → "My name is Ravi"
AI → "Nice to meet you, Ravi!"
Memory stores: [User: "My name is Ravi", Assistant: "Nice to meet you, Ravi!"]
Turn 2: User → "What is my name?"
Memory replays history first, then adds new question:
[System: ..., User: "My name is Ravi", Assistant: "Nice to meet you, Ravi!", User: "What is my name?"]
AI → "Your name is Ravi." ← knows from history
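The replay step above can be sketched in plain Java, without Spring AI. The `ConversationStore` class and `Msg` record below are hypothetical names used only for illustration; they are not Spring AI types, and the real advisor's internals may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not a Spring AI class.
public class ConversationStore {

    // One message in a conversation; role is "system", "user" or "assistant".
    public record Msg(String role, String text) {}

    // History per conversation ID, oldest message first.
    private final Map<String, List<Msg>> history = new HashMap<>();

    // Build the list sent to the model: system prompt, stored history, then the new question.
    public List<Msg> buildPrompt(String conversationId, String systemPrompt, String userText) {
        List<Msg> prompt = new ArrayList<>();
        prompt.add(new Msg("system", systemPrompt));
        prompt.addAll(history.getOrDefault(conversationId, List.of()));
        prompt.add(new Msg("user", userText));
        return prompt;
    }

    // Record a completed exchange so the next turn can replay it.
    public void saveExchange(String conversationId, String userText, String assistantText) {
        List<Msg> msgs = history.computeIfAbsent(conversationId, id -> new ArrayList<>());
        msgs.add(new Msg("user", userText));
        msgs.add(new Msg("assistant", assistantText));
    }
}
```

After one saved exchange for a given conversation ID, `buildPrompt` for that ID returns four messages (system, user, assistant, new user), which mirrors Turn 2 above. A different conversation ID starts with an empty history.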
Maven Dependency
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
ChatMemoryService.java — In-Memory Store
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.InMemoryChatMemory;
import org.springframework.stereotype.Service;
@Service
public class ChatMemoryService {
private final ChatClient chatClient;
private final InMemoryChatMemory memory;
public ChatMemoryService(ChatClient.Builder builder) {
this.memory = new InMemoryChatMemory();
this.chatClient = builder
.defaultSystem("You are a helpful Java programming assistant.")
.defaultAdvisors(new MessageChatMemoryAdvisor(memory))
.build();
}
public String chat(String sessionId, String userMessage) {
return chatClient.prompt()
.user(userMessage)
.advisors(a -> a.param(
MessageChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY,
sessionId
))
.call()
.content();
}
public void clearSession(String sessionId) {
memory.clear(sessionId);
}
}
ChatMemoryController.java
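import org.springframework.web.bind.annotation.DeleteMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
@RestController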
@RequestMapping("/chat")
public class ChatMemoryController {
private final ChatMemoryService chatService;
public ChatMemoryController(ChatMemoryService chatService) {
this.chatService = chatService;
}
@PostMapping("/{sessionId}")
public String chat(@PathVariable String sessionId,
@RequestBody String message) {
return chatService.chat(sessionId, message);
}
@DeleteMapping("/{sessionId}")
public String clear(@PathVariable String sessionId) {
chatService.clearSession(sessionId);
return "Session cleared";
}
}
Conversation Demo — Consecutive Calls
// POST /chat/user123 body: "My name is Ravi and I work with Spring Boot"
Response: "Nice to meet you, Ravi! Spring Boot is a great framework. How can I help you today?"
// POST /chat/user123 body: "What frameworks do I work with?"
Response: "You mentioned you work with Spring Boot."
// POST /chat/user123 body: "Explain @Transactional to me"
Response: "@Transactional marks a method so Spring wraps it in a database transaction..."
// POST /chat/user123 body: "Give me an example of what we just discussed"
Response: "Sure! Here's an example of @Transactional in Spring Boot:
@Service
public class UserService {
@Transactional
public void saveUser(User user) { ... }
}"
Window-Based Memory — Limit History Size
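Every replayed message counts toward the prompt's token total, so a long history raises cost and can eventually exceed the model's context window. Use a sliding window to keep only the last N messages:
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.InMemoryChatMemory;
import org.springframework.stereotype.Service;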
@Service
public class WindowedChatService {
private final ChatClient chatClient;
private final InMemoryChatMemory memory;
public WindowedChatService(ChatClient.Builder builder) {
this.memory = new InMemoryChatMemory();
this.chatClient = builder
.defaultSystem("You are a Java expert assistant.")
// Arguments: memory store, fallback conversation ID, number of most recent
// messages to replay (20 messages = 10 exchanges)
.defaultAdvisors(new MessageChatMemoryAdvisor(memory, "default", 20))
.build();
}
public String chat(String sessionId, String message) {
return chatClient.prompt()
.user(message)
.advisors(a -> a.param(
MessageChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY, sessionId))
.call()
.content();
}
}
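The trimming behind a sliding window can be sketched in plain Java as follows. `MessageWindow` and `lastN` are hypothetical names for illustration, not part of Spring AI; the sketch only shows the idea of keeping the newest N entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Spring AI.
public final class MessageWindow {

    private MessageWindow() {}

    // Return a copy of the last maxMessages entries, oldest first.
    public static <T> List<T> lastN(List<T> messages, int maxMessages) {
        if (maxMessages < 0) {
            throw new IllegalArgumentException("maxMessages must be non-negative");
        }
        // Clamp at 0 so a window larger than the history returns everything.
        int from = Math.max(0, messages.size() - maxMessages);
        return new ArrayList<>(messages.subList(from, messages.size()));
    }
}
```

If the stored history consists of complete user/assistant pairs, an even window size keeps those pairs whole, so the replayed context never starts with an answer whose question was dropped.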
Persistent Chat Memory with Redis
<!-- pom.xml -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
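import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.messages.Message;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;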
@Component
public class RedisChatMemory implements ChatMemory {
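// Assumes a RedisTemplate bean of this type is defined elsewhere, with a value
// serializer that can round-trip the concrete Message implementations
private final RedisTemplate<String, List<Message>> redisTemplate;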
private static final String PREFIX = "chat:";
private static final Duration TTL = Duration.ofHours(24);
public RedisChatMemory(RedisTemplate<String, List<Message>> redisTemplate) {
this.redisTemplate = redisTemplate;
}
@Override
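public void add(String conversationId, List<Message> messages) {
String key = PREFIX + conversationId;
// Read-modify-write: this is not atomic, so concurrent requests for the
// same conversation can overwrite each other's messages
List<Message> existing = get(conversationId, Integer.MAX_VALUE);
existing.addAll(messages);
// Writing the value also resets the 24-hour expiry for this conversation
redisTemplate.opsForValue().set(key, existing, TTL);
}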
@Override
public List<Message> get(String conversationId, int lastN) {
String key = PREFIX + conversationId;
List<Message> all = redisTemplate.opsForValue().get(key);
if (all == null) return new ArrayList<>();
int from = Math.max(0, all.size() - lastN);
return new ArrayList<>(all.subList(from, all.size()));
}
@Override
public void clear(String conversationId) {
redisTemplate.delete(PREFIX + conversationId);
}
}
Key Points
MessageChatMemoryAdvisorintercepts every call, prepends history, and saves the new exchange automatically- Each
sessionId/conversationIdmaintains its own independent history - Use
InMemoryChatMemoryfor development and testing; replace with a persistent store (Redis, DB) for production - Window-based memory limits context size to control costs — keep the last 10–20 messages for most use cases
- Always call
clearSession()when a user logs out or starts a new topic to prevent stale context