Spring AI Text Classification — Sentiment Analysis, Topic Detection, and Content Moderation
Text classification is one of the most common production AI use cases: categorizing support tickets, detecting sentiment in reviews, moderating user content, routing emails. With Spring AI you can build classifiers on top of a hosted LLM that need no training data or fine-tuning, and that often match or beat traditional ML models on nuanced text.
Why AI-Based Classification Beats Traditional ML
Traditional ML classifier:
→ Requires labeled training dataset (thousands of examples)
→ Takes days to train and validate
→ Needs retraining when categories change
→ Struggles with nuance and sarcasm
GPT-4o-mini classifier:
→ Works immediately with zero training data
→ Add/change categories in 5 minutes (just change the prompt)
→ Handles sarcasm, context, and unusual phrasing naturally
→ Often 90%+ accuracy on common classification tasks (verify on a labeled sample of your own data)
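→ Cost: roughly $0.01-0.02 per 1,000 short classifications (about 100 input tokens each, at gpt-4o-mini's launch price of $0.15 per 1M input tokens)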
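The point about changing categories by editing the prompt can be made concrete by generating the label list from an enum, so that adding a category is a one-line change. This is a minimal sketch: `TicketCategory` and `CategoryPrompt` are hypothetical names for illustration, not part of Spring AI.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical label set; adding a constant here extends the generated prompt.
enum TicketCategory { BILLING, TECHNICAL_ISSUE, FEATURE_REQUEST }

final class CategoryPrompt {
    private CategoryPrompt() {}

    // Builds a system prompt that lists every constant of the given enum.
    static <E extends Enum<E>> String systemPromptFor(Class<E> labels) {
        String names = Arrays.stream(labels.getEnumConstants())
                .map(Enum::name)
                .collect(Collectors.joining(", "));
        return "Classify the message into exactly one category.\n"
                + "Output ONLY one of: " + names + "\n"
                + "Do not explain. Output the label only.";
    }
}
```

The returned string could be passed to `.system(...)` in the classifiers shown in this article; because the prompt and the enum share one source of truth, the parsing side cannot drift from the label list.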
Basic Sentiment Classifier
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

@Service
public class SentimentClassifier {

    private final ChatClient chatClient;

    public SentimentClassifier(ChatClient.Builder builder) {
        // Builder method names follow the Spring AI milestone releases;
        // Spring AI 1.0 renamed them to model(...) and temperature(...).
        this.chatClient = builder
                .defaultOptions(OpenAiChatOptions.builder()
                        .withModel("gpt-4o-mini")
                        .withTemperature(0.0f) // minimize run-to-run variation
                        .build())
                .build();
    }

    public Sentiment classifySentiment(String text) {
        String result = chatClient.prompt()
                .system("""
                        Classify the sentiment of the given text.
                        Output ONLY one of: POSITIVE, NEGATIVE, NEUTRAL, MIXED
                        Do not explain. Output the label only.
                        """)
                .user(text)
                .call()
                .content();
        if (result == null) {
            return Sentiment.NEUTRAL; // the model returned no content
        }
        try {
            return Sentiment.valueOf(result.trim().toUpperCase());
        } catch (IllegalArgumentException e) {
            return Sentiment.NEUTRAL; // default on parse failure
        }
    }

    public SentimentDetail classifyWithScore(String text) {
        // entity() maps the model's JSON reply onto the record's fields
        return chatClient.prompt()
                .system("Classify sentiment. Return JSON with label, confidence 0.0-1.0, and a short reason")
                .user(text)
                .call()
                .entity(SentimentDetail.class);
    }
}

enum Sentiment { POSITIVE, NEGATIVE, NEUTRAL, MIXED }

record SentimentDetail(String label, double confidence, String reason) {}
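The classifier above falls back to NEUTRAL whenever the reply is not an exact label, which also swallows near misses such as `"Positive."` or a quoted label. A more tolerant parser might look like the sketch below. `LabelParser` is a hypothetical helper, and the `Sentiment` enum is repeated only so the snippet compiles on its own.

```java
import java.util.Locale;
import java.util.Optional;

// Repeated from the listing above so this sketch is self-contained.
enum Sentiment { POSITIVE, NEGATIVE, NEUTRAL, MIXED }

final class LabelParser {
    private LabelParser() {}

    // Returns the matching constant, or empty if the reply is null or not a known label.
    static <E extends Enum<E>> Optional<E> parse(Class<E> type, String reply) {
        if (reply == null) {
            return Optional.empty();
        }
        // Keep only characters that can appear in a label; this drops quotes,
        // periods, whitespace and markdown emphasis around the label.
        String cleaned = reply.trim()
                .toUpperCase(Locale.ROOT)
                .replaceAll("[^A-Z0-9_]", "");
        try {
            return Optional.of(Enum.valueOf(type, cleaned));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }
}
```

Calling `LabelParser.parse(Sentiment.class, reply).orElse(Sentiment.NEUTRAL)` keeps the original fallback behavior, while the empty `Optional` gives callers a chance to log or retry instead of silently recording NEUTRAL.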
Multi-Label Topic Classifier
import java.util.Arrays;
import java.util.List;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

@Service
public class TopicClassifier {

    private final ChatClient chatClient;

    private static final List<String> TOPICS = List.of(
            "billing", "technical_issue", "feature_request",
            "complaint", "general_inquiry", "refund_request",
            "account_access", "shipping"
    );

    public TopicClassifier(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultOptions(OpenAiChatOptions.builder()
                        .withModel("gpt-4o-mini")
                        .withTemperature(0.0f)
                        .build())
                .build();
    }

    // Single-label classification; returns "unknown" if the reply is not a known category
    public String classify(String text) {
        String label = chatClient.prompt()
                .user("""
                        Classify this customer message into ONE category.
                        Categories: %s
                        Message: %s
                        Output ONLY the category name.
                        """.formatted(String.join(", ", TOPICS), text))
                .call()
                .content();
        if (label == null) {
            return "unknown";
        }
        String normalized = label.trim().toLowerCase();
        return TOPICS.contains(normalized) ? normalized : "unknown";
    }

    // Multi-label: a message can belong to multiple categories
    public List<String> classifyMultiLabel(String text) {
        String result = chatClient.prompt()
                .user("""
                        Which categories apply to this message? (can be multiple)
                        Categories: %s
                        Message: %s
                        Output as comma-separated list. Example: billing,complaint
                        """.formatted(String.join(", ", TOPICS), text))
                .call()
                .content();
        if (result == null) {
            return List.of();
        }
        // Drop anything that is not one of the known categories
        return Arrays.stream(result.split(","))
                .map(String::trim)
                .map(String::toLowerCase)
                .filter(TOPICS::contains)
                .toList();
    }
}
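The multi-label method above assumes the model replies with a clean comma-separated list. In practice a reply may arrive as a bulleted list, one label per line, or with spaces instead of underscores. The sketch below shows one way to normalize such replies; `MultiLabelParser` is a hypothetical helper, and the separators it accepts are an assumption about likely variations, not something Spring AI guarantees.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class MultiLabelParser {
    private MultiLabelParser() {}

    // Splits on commas, semicolons or line breaks, normalizes each token,
    // keeps only allowed labels, and removes duplicates while preserving order.
    static List<String> parse(String reply, Set<String> allowed) {
        if (reply == null || reply.isBlank()) {
            return List.of();
        }
        Set<String> found = new LinkedHashSet<>();
        for (String token : reply.split("[,;\\r\\n]+")) {
            String label = token.trim()
                    .toLowerCase(Locale.ROOT)
                    // strip leading bullets or quotes and trailing punctuation
                    .replaceAll("^[^a-z]+|[^a-z_]+$", "")
                    .replace(' ', '_');
            if (allowed.contains(label)) {
                found.add(label);
            }
        }
        return List.copyOf(found);
    }
}
```

For example, `MultiLabelParser.parse(reply, Set.copyOf(TOPICS))` could replace the stream pipeline in `classifyMultiLabel`. Unknown labels are still dropped, so a hallucinated category never reaches downstream routing.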
Content Moderation Service
import java.util.List;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

public record ModerationResult(
        boolean safe,
        List<String> violations, // e.g., ["spam", "harassment"]
        double toxicityScore,    // 0.0-1.0
        String action            // "allow", "flag_for_review", "block"
) {}

@Service
public class ContentModerationService {

    private final ChatClient chatClient;

    public ContentModerationService(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultOptions(OpenAiChatOptions.builder()
                        .withModel("gpt-4o-mini")
                        .withTemperature(0.0f)
                        .build())
                .build();
    }

    public ModerationResult moderate(String userContent) {
        ModerationResult result = chatClient.prompt()
                .system("""
                        You are a content moderation system.
                        Analyze the content for: spam, harassment, hate_speech,
                        violence, adult_content, misinformation, self_harm.
                        Set toxicityScore between 0.0 and 1.0 and action to one of:
                        allow, flag_for_review, block.
                        Be accurate: false positives harm legitimate users.
                        """)
                .user("Moderate this content: " + userContent)
                .call()
                .entity(ModerationResult.class);
        if (result == null) {
            throw new IllegalStateException("Model returned no moderation result");
        }

        // Override the model's action based on the score:
        // above 0.9 is blocked, above 0.5 goes to human review
        if (result.toxicityScore() > 0.9) {
            return new ModerationResult(false, result.violations(),
                    result.toxicityScore(), "block");
        } else if (result.toxicityScore() > 0.5) {
            return new ModerationResult(false, result.violations(),
                    result.toxicityScore(), "flag_for_review");
        }
        return result; // low score: keep the model's own decision
    }
}
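The score-to-action override in `moderate()` is plain threshold logic, so it can be pulled into a pure function and unit-tested without calling the model. The sketch below uses the same 0.9 and 0.5 thresholds. `ModerationPolicy` is a hypothetical name, and two behaviors are assumptions that differ from the listing above: scores at or below 0.5 always map to "allow", and out-of-range scores are sent to review.

```java
final class ModerationPolicy {
    // Thresholds taken from the moderate() method above.
    static final double BLOCK_ABOVE = 0.9;
    static final double REVIEW_ABOVE = 0.5;

    private ModerationPolicy() {}

    // Maps a toxicity score to one of: "allow", "flag_for_review", "block".
    static String actionFor(double toxicityScore) {
        if (Double.isNaN(toxicityScore) || toxicityScore < 0.0 || toxicityScore > 1.0) {
            // Assumption: a score outside 0.0-1.0 means the reply is unreliable,
            // so a human should look at it.
            return "flag_for_review";
        }
        if (toxicityScore > BLOCK_ABOVE) {
            return "block";
        }
        if (toxicityScore > REVIEW_ABOVE) {
            return "flag_for_review";
        }
        return "allow";
    }
}
```

Note that both comparisons are strict, so a score of exactly 0.9 is flagged for review rather than blocked, matching the `>` checks in the service.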
Batch Classification
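import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.springframework.stereotype.Service;

@Service
public class BatchClassificationService {

    private final SentimentClassifier sentimentClassifier;
    private final TopicClassifier topicClassifier;

    public BatchClassificationService(SentimentClassifier sentimentClassifier,
                                      TopicClassifier topicClassifier) {
        this.sentimentClassifier = sentimentClassifier;
        this.topicClassifier = topicClassifier;
    }

    // Classify multiple texts concurrently using virtual threads (Java 21+).
    // Duplicate texts share one map entry; failed texts are missing from the result.
    public Map<String, ClassificationResult> classifyBatch(List<String> texts) {
        Map<String, ClassificationResult> results = new ConcurrentHashMap<>();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<?>> futures = new ArrayList<>();
            for (String text : texts) {
                futures.add(executor.submit(() -> {
                    Sentiment sentiment = sentimentClassifier.classifySentiment(text);
                    String topic = topicClassifier.classify(text);
                    results.put(text, new ClassificationResult(sentiment, topic));
                }));
            }
            futures.forEach(f -> {
                try {
                    f.get();
                } catch (Exception e) {
                    System.err.println("Classification failed: " + e.getMessage());
                }
            });
        }
        return results;
    }
}

record ClassificationResult(Sentiment sentiment, String topic) {}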
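One thread per text means a large batch sends all of its requests at once, which can run into provider rate limits. A common remedy is to cap the number of in-flight calls with a semaphore. The sketch below is one way to do that under stated assumptions: `BoundedBatch` and `maxInFlight` are hypothetical names, the classifier is passed in as a plain `Function` so the snippet does not depend on Spring AI, and the caller supplies and closes the executor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

final class BoundedBatch {
    private BoundedBatch() {}

    // Applies the classifier to every text with at most maxInFlight calls running at once.
    // Duplicate texts share one map entry; texts whose classification throws are omitted.
    static <R> Map<String, R> classifyAll(List<String> texts,
                                          Function<String, R> classifier,
                                          ExecutorService executor,
                                          int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        Map<String, R> results = new ConcurrentHashMap<>();
        List<Future<?>> futures = new ArrayList<>();
        for (String text : texts) {
            futures.add(executor.submit(() -> {
                permits.acquireUninterruptibly(); // wait for a free slot
                try {
                    results.put(text, classifier.apply(text));
                } finally {
                    permits.release();
                }
            }));
        }
        for (Future<?> f : futures) {
            try {
                f.get();
            } catch (ExecutionException e) {
                // This text failed; callers can compare result keys with the input.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return results;
    }
}
```

With the services from this article, the call might look like `BoundedBatch.classifyAll(texts, sentimentClassifier::classifySentiment, executor, 20)`, where `executor` comes from `Executors.newVirtualThreadPerTaskExecutor()`. Waiting on the semaphore is cheap on a virtual thread, so the cap limits outbound requests without tying up platform threads; the value 20 is illustrative and should be tuned to your rate limit.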
Output
// Sentiment classification
classifySentiment("The product quality is amazing, exceeded my expectations!")
→ POSITIVE
classifySentiment("It works but the price is way too high for what you get")
→ MIXED
// Topic classification
classify("I've been charged twice for the same order")
→ billing
classifyMultiLabel("I can't log in and I need a refund for my subscription")
→ ["account_access", "refund_request"]
// Content moderation
moderate("Buy cheap products now!!! Click here http://spam.com")
→ ModerationResult[safe=false, violations=["spam"], toxicityScore=0.85, action="flag_for_review"]
moderate("How do I reset my password?")
→ ModerationResult[safe=true, violations=[], toxicityScore=0.02, action="allow"]
Key Points
- Set temperature=0.0 for classification: you want the same input to produce the same label as consistently as possible
- Use gpt-4o-mini for classification: it is accurate for most categories and costs roughly 20x less than gpt-4o
- Ask the model to output only the label (not an explanation): it reduces tokens and simplifies parsing
- Use Java virtual threads (Executors.newVirtualThreadPerTaskExecutor(), Java 21+) for batch classification: each blocking AI call runs in its own virtual thread at minimal overhead
- Add a fallback value (NEUTRAL, "unknown") and catch IllegalArgumentException on enum parsing: the model occasionally returns unexpected variations