Java9R: Spring AI Text Classification — Sentiment Analysis, Topic Detection, and Content Moderation

Text classification is one of the most common production AI use cases: categorizing support tickets, detecting sentiment in reviews, moderating user content, and routing emails. With Spring AI, an LLM-backed classifier takes only a few lines of code and needs no training data or fine-tuning, and on many tasks its accuracy is competitive with a traditionally trained model.

Why AI-Based Classification Beats Traditional ML

Traditional ML classifier:
  → Requires labeled training dataset (thousands of examples)
  → Takes days to train and validate
  → Needs retraining when categories change
  → Struggles with nuance and sarcasm

GPT-4o-mini classifier:
  → Works immediately with zero training data
  → Add/change categories in 5 minutes (just change the prompt)
  → Handles sarcasm, context, and unusual phrasing naturally
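  → Often 90%+ accuracy on common classification tasks (measure it on a sample of your own data)
  → Cost: roughly a cent or two per 1,000 short classifications (depends on prompt length and current pricing)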
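
The cost figure is simple arithmetic over token counts, so it is worth recomputing for your own prompts. The sketch below is illustrative only: `ClassificationCost` and its parameter names are hypothetical, and the prices you pass in are assumptions that should come from the provider's current price list.

```java
/** Back-of-the-envelope cost estimate for LLM classification calls. */
final class ClassificationCost {

    private ClassificationCost() {}

    /**
     * Estimated USD cost of 1,000 classification calls.
     * Prices are per one million tokens and must be supplied by the caller.
     */
    static double usdPerThousandCalls(int inputTokensPerCall,
                                      int outputTokensPerCall,
                                      double usdPerMillionInputTokens,
                                      double usdPerMillionOutputTokens) {
        double usdPerCall =
                (inputTokensPerCall * usdPerMillionInputTokens
                        + outputTokensPerCall * usdPerMillionOutputTokens)
                        / 1_000_000.0;
        return usdPerCall * 1_000;
    }
}
```

For example, assuming 120 input tokens and 2 output tokens per call at placeholder prices of $0.15 and $0.60 per million tokens, `ClassificationCost.usdPerThousandCalls(120, 2, 0.15, 0.60)` comes to about $0.019 per 1,000 calls. Longer system prompts or JSON output raise that figure proportionally.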

Basic Sentiment Classifier

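import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

import java.util.Locale;

@Service
public class SentimentClassifier {

    private final ChatClient chatClient;

    public SentimentClassifier(ChatClient.Builder builder) {
        // Builder method names follow Spring AI 1.0; older milestone releases
        // used withModel(...) / withTemperature(...) instead.
        this.chatClient = builder
                .defaultOptions(OpenAiChatOptions.builder()
                        .model("gpt-4o-mini")
                        .temperature(0.0)  // minimize run-to-run variation
                        .build())
                .build();
    }

    public Sentiment classifySentiment(String text) {
        String result = chatClient.prompt()
                .system("""
                        Classify the sentiment of the given text.
                        Output ONLY one of: POSITIVE, NEGATIVE, NEUTRAL, MIXED
                        Do not explain. Output the label only.
                        """)
                .user(text)
                .call()
                .content();

        if (result == null) {
            return Sentiment.NEUTRAL;  // default when no content is returned
        }
        try {
            // Locale.ROOT keeps upper-casing independent of the JVM's default locale
            return Sentiment.valueOf(result.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Sentiment.NEUTRAL;  // default on parse failure
        }
    }

    public SentimentDetail classifyWithScore(String text) {
        // entity() maps the model's JSON reply onto the SentimentDetail record
        return chatClient.prompt()
                .system("""
                        Classify the sentiment as POSITIVE, NEGATIVE, NEUTRAL, or MIXED.
                        Return JSON with label, confidence (0.0-1.0), and a one-sentence reason.
                        """)
                .user(text)
                .call()
                .entity(SentimentDetail.class);
    }
}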

enum Sentiment { POSITIVE, NEGATIVE, NEUTRAL, MIXED }
record SentimentDetail(String label, double confidence, String reason) {}
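
Even with a "label only" instruction, a model sometimes replies with trailing punctuation or a prefix such as "Sentiment: POSITIVE", which a strict `valueOf` would reject. A minimal sketch of a more tolerant parser follows; `SentimentLabelParser` and its nested `Label` enum are hypothetical names that mirror the `Sentiment` enum above, and the whole-word matching rule is an assumption rather than anything Spring AI provides.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Pattern;

/** Tolerant parsing of a sentiment label from free-form model output. */
final class SentimentLabelParser {

    enum Label { POSITIVE, NEGATIVE, NEUTRAL, MIXED }

    private SentimentLabelParser() {}

    /**
     * Returns the single label mentioned in the reply, or empty when the
     * reply names no label or names more than one (ambiguous).
     */
    static Optional<Label> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String upper = raw.toUpperCase(Locale.ROOT);
        Label found = null;
        for (Label label : Label.values()) {
            // \b matches word boundaries, so "POSITIVE." and "Sentiment: POSITIVE" both count
            boolean mentioned = Pattern.compile("\\b" + label.name() + "\\b")
                    .matcher(upper)
                    .find();
            if (mentioned) {
                if (found != null) {
                    return Optional.empty();  // two labels mentioned: treat as unparseable
                }
                found = label;
            }
        }
        return Optional.ofNullable(found);
    }
}
```

Returning `Optional` leaves the fallback choice to the caller, for example `parse(reply).orElse(Label.NEUTRAL)`, and makes it easy to log how often the fallback is actually used.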

Multi-Label Topic Classifier

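import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

import java.util.Arrays;
import java.util.List;
import java.util.Locale;

@Service
public class TopicClassifier {

    private final ChatClient chatClient;

    private static final List<String> TOPICS = List.of(
            "billing", "technical_issue", "feature_request",
            "complaint", "general_inquiry", "refund_request",
            "account_access", "shipping"
    );

    public TopicClassifier(ChatClient.Builder builder) {
        // Builder method names follow Spring AI 1.0; older milestone releases
        // used withModel(...) / withTemperature(...) instead.
        this.chatClient = builder
                .defaultOptions(OpenAiChatOptions.builder()
                        .model("gpt-4o-mini")
                        .temperature(0.0)
                        .build())
                .build();
    }

    // Single-label classification; returns "unknown" for anything outside TOPICS
    public String classify(String text) {
        String result = chatClient.prompt()
                .user("""
                      Classify this customer message into ONE category.
                      Categories: %s
                      Message: %s
                      Output ONLY the category name.
                      """.formatted(String.join(", ", TOPICS), text))
                .call()
                .content();

        if (result == null) {
            return "unknown";
        }
        String label = result.trim().toLowerCase(Locale.ROOT);
        return TOPICS.contains(label) ? label : "unknown";
    }

    // Multi-label: a message can belong to multiple categories
    public List<String> classifyMultiLabel(String text) {
        String result = chatClient.prompt()
                .user("""
                      Which categories apply to this message? (can be multiple)
                      Categories: %s
                      Message: %s
                      Output as comma-separated list. Example: billing,complaint
                      """.formatted(String.join(", ", TOPICS), text))
                .call()
                .content();

        if (result == null) {
            return List.of();
        }
        // Keep only labels that are in the known category list
        return Arrays.stream(result.trim().split(","))
                .map(String::trim)
                .map(label -> label.toLowerCase(Locale.ROOT))
                .filter(TOPICS::contains)
                .toList();
    }
}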
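
Splitting on commas works when the model follows the example format exactly. As a hedged sketch of a slightly more forgiving approach, the helper below also accepts semicolons and newlines as separators, strips bullets and punctuation, and removes duplicates while keeping the model's order. `TopicLabelParser` is a hypothetical name and the normalization rules are assumptions chosen for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Tolerant parsing of a multi-label reply against a fixed category list. */
final class TopicLabelParser {

    private TopicLabelParser() {}

    /** Returns the allowed labels found in the reply, de-duplicated, in reply order. */
    static List<String> parse(String raw, List<String> allowed) {
        if (raw == null) {
            return List.of();
        }
        Set<String> labels = new LinkedHashSet<>();  // keeps first-seen order
        for (String token : raw.split("[,;\\n]")) {
            String label = token.toLowerCase(Locale.ROOT)
                    .replaceAll("[^a-z_ ]", "")  // drop bullets, quotes, periods
                    .trim()
                    .replaceAll(" +", "_");      // "technical issue" -> "technical_issue"
            if (allowed.contains(label)) {
                labels.add(label);
            }
        }
        return List.copyOf(labels);
    }
}
```

Labels outside the allowed list are dropped silently here; in production it may be worth counting them, since a rising count usually means the prompt or the category list needs attention.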

Content Moderation Service

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

import java.util.List;

public record ModerationResult(
        boolean safe,
        List<String> violations,    // e.g., ["spam", "harassment"]
        double toxicityScore,       // 0.0-1.0
        String action               // "allow", "flag_for_review", "block"
) {}

@Service
public class ContentModerationService {

    private final ChatClient chatClient;

    public ContentModerationService(ChatClient.Builder builder) {
        // Builder method names follow Spring AI 1.0; older milestone releases
        // used withModel(...) / withTemperature(...) instead.
        this.chatClient = builder
                .defaultOptions(OpenAiChatOptions.builder()
                        .model("gpt-4o-mini")
                        .temperature(0.0)
                        .build())
                .build();
    }

    public ModerationResult moderate(String userContent) {
        ModerationResult result = chatClient.prompt()
                .system("""
                        You are a content moderation system.
                        Analyze the content for: spam, harassment, hate_speech,
                        violence, adult_content, misinformation, self_harm.
                        Be accurate: false positives harm legitimate users.
                        """)
                .user("Moderate this content: " + userContent)
                .call()
                .entity(ModerationResult.class);

        // No parseable response: fail closed and route to human review
        if (result == null) {
            return new ModerationResult(false, List.of(), 0.0, "flag_for_review");
        }

        // Override the model's action based on score:
        // above 0.9 is blocked, above 0.5 goes to human review
        if (result.toxicityScore() > 0.9) {
            return new ModerationResult(false, result.violations(),
                    result.toxicityScore(), "block");
        } else if (result.toxicityScore() > 0.5) {
            return new ModerationResult(false, result.violations(),
                    result.toxicityScore(), "flag_for_review");
        }

        // At or below 0.5 the model's own verdict is returned unchanged
        return result;
    }
}
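
The score thresholds are easier to reason about, and to unit test, when they live in a pure function with no model call involved. The following is a minimal sketch under stated assumptions: `ModerationPolicy`, its `Action` enum, and the choice to send out-of-range scores and "unsafe" verdicts with low scores to review are all hypothetical, and that last rule is stricter than the service above, which returns low-score results unchanged.

```java
/** Threshold policy for turning a toxicity score into a moderation action. */
final class ModerationPolicy {

    enum Action { ALLOW, FLAG_FOR_REVIEW, BLOCK }

    static final double BLOCK_ABOVE = 0.9;
    static final double REVIEW_ABOVE = 0.5;

    private ModerationPolicy() {}

    static Action decide(double toxicityScore, boolean modelSaysSafe) {
        // A score outside 0.0-1.0 (or NaN) means the model output is not trustworthy
        if (Double.isNaN(toxicityScore) || toxicityScore < 0.0 || toxicityScore > 1.0) {
            return Action.FLAG_FOR_REVIEW;
        }
        if (toxicityScore > BLOCK_ABOVE) {
            return Action.BLOCK;
        }
        // Assumption: a low score combined with an "unsafe" verdict still gets a human look
        if (toxicityScore > REVIEW_ABOVE || !modelSaysSafe) {
            return Action.FLAG_FOR_REVIEW;
        }
        return Action.ALLOW;
    }
}
```

With these thresholds a score of 0.85 yields `FLAG_FOR_REVIEW`, not `BLOCK`, because both comparisons are strict and only scores above 0.9 are blocked.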

Batch Classification

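import org.springframework.stereotype.Service;

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

@Service
public class BatchClassificationService {

    private final SentimentClassifier sentimentClassifier;
    private final TopicClassifier topicClassifier;

    // Constructor injection: the final fields must be assigned here
    public BatchClassificationService(SentimentClassifier sentimentClassifier,
                                      TopicClassifier topicClassifier) {
        this.sentimentClassifier = sentimentClassifier;
        this.topicClassifier = topicClassifier;
    }

    // Classify multiple texts concurrently using virtual threads (Java 21+).
    // Results are keyed by text, so duplicate inputs collapse into one entry,
    // and texts whose classification failed are absent from the map.
    public Map<String, ClassificationResult> classifyBatch(List<String> texts) {
        Map<String, ClassificationResult> results = new ConcurrentHashMap<>();

        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<?>> futures = texts.stream()
                    .map(text -> executor.submit(() -> {
                        Sentiment sentiment = sentimentClassifier.classifySentiment(text);
                        String topic = topicClassifier.classify(text);
                        results.put(text, new ClassificationResult(sentiment, topic));
                    }))
                    .toList();

            // Wait for every task; report failures without aborting the batch
            futures.forEach(f -> {
                try { f.get(); } catch (Exception e) {
                    System.err.println("Classification failed: " + e.getMessage());
                }
            });
        }

        return results;
    }
}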

record ClassificationResult(Sentiment sentiment, String topic) {}
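
One virtual thread per text means a large batch can send every request at the same moment, which may run into provider rate limits. A hedged sketch of capping the number of in-flight calls with a `Semaphore` follows. `BoundedBatch`, `mapAll`, `maxInFlight`, and `fallback` are hypothetical names, and the executor is passed in so the same code works with a virtual-thread executor or a plain thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

/** Runs a task over many inputs with a cap on concurrent executions. */
final class BoundedBatch {

    private BoundedBatch() {}

    /** Results are returned in input order; a failed task yields the fallback value. */
    static <R> List<R> mapAll(List<String> inputs,
                              Function<String, R> task,
                              ExecutorService executor,
                              int maxInFlight,
                              R fallback) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<Future<R>> futures = new ArrayList<>();
        for (String input : inputs) {
            futures.add(executor.submit(() -> {
                permits.acquire();          // wait until a slot is free
                try {
                    return task.apply(input);
                } finally {
                    permits.release();      // always free the slot, even on failure
                }
            }));
        }

        List<R> results = new ArrayList<>();
        for (Future<R> future : futures) {
            try {
                results.add(future.get());
            } catch (ExecutionException e) {
                results.add(fallback);      // the task threw: record the fallback
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for batch", e);
            }
        }
        return results;
    }
}
```

On Java 21 the executor argument could be `Executors.newVirtualThreadPerTaskExecutor()`, where a blocked `acquire()` parks a cheap virtual thread. Returning a list in input order, instead of a map keyed by text, also keeps duplicate inputs distinct.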

Output

// Sentiment classification
classifySentiment("The product quality is amazing, exceeded my expectations!")
→ POSITIVE

classifySentiment("It works but the price is way too high for what you get")
→ MIXED

// Topic classification
classify("I've been charged twice for the same order")
→ billing

classifyMultiLabel("I can't log in and I need a refund for my subscription")
→ ["account_access", "refund_request"]

// Content moderation
moderate("Buy cheap products now!!! Click here http://spam.com")
→ ModerationResult[safe=false, violations=["spam"], toxicityScore=0.85, action="flag_for_review"]

moderate("How do I reset my password?")
→ ModerationResult[safe=true, violations=[], toxicityScore=0.02, action="allow"]

Key Points

Set temperature=0.0 for classification: it makes the output as repeatable as possible, although identical labels on every run are not strictly guaranteed
Use gpt-4o-mini for classification: it is accurate enough for most categories and costs far less per token than gpt-4o
Ask the model to output only the label (not an explanation): this reduces output tokens and simplifies parsing
Use Java virtual threads (Executors.newVirtualThreadPerTaskExecutor()) for batch classification — each blocking AI call runs in its own virtual thread at minimal overhead
Add a fallback enum value (NEUTRAL, "unknown") and catch IllegalArgumentException on enum parsing — AI occasionally returns unexpected variations
