Spring AI Testing — Unit and Integration Tests for AI-Powered Applications

Testing AI applications requires different strategies from regular Java applications. You don't want unit tests to make real API calls — they'd be slow and cost money. But you also need integration tests to verify that your prompts, parsers, and flows actually work end-to-end. This tutorial covers mocking, record-and-replay testing, and evaluation-based testing for Spring AI.

Testing Challenges for AI Apps

Challenge 1: Non-determinism
  LLM responses vary between calls → hard to assert exact output

Challenge 2: Cost
  Unit tests calling real APIs → expensive, slow CI pipelines

Challenge 3: Quality
  "Did the AI give a good answer?" is not a binary pass/fail

Solutions:
  Unit tests     → Mock ChatClient, test your code logic
  Contract tests → Record real responses, replay in tests
  Eval tests     → Use another AI call to judge output quality

Unit Testing — Mock ChatClient

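import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;
import org.mockito.junit.jupiter.MockitoSettings;
import org.mockito.quality.Strictness;
import org.springframework.ai.chat.client.ChatClient;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.mockito.ArgumentMatchers.anyString;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

@ExtendWith(MockitoExtension.class)
// Lenient: the empty-question test never reaches the chain, and strict stubs
// would fail it with UnnecessaryStubbingException
@MockitoSettings(strictness = Strictness.LENIENT)
class AiServiceTest {

    @Mock
    private ChatClient.Builder chatClientBuilder;

    @Mock
    private ChatClient chatClient;

    @Mock
    private ChatClient.ChatClientRequestSpec requestSpec;

    @Mock
    private ChatClient.CallResponseSpec callSpec;

    private AiService aiService;

    @BeforeEach
    void setup() {
        // Wire the fluent mock chain; anyString() avoids the overload ambiguity of any()
        when(chatClientBuilder.defaultSystem(anyString())).thenReturn(chatClientBuilder);
        when(chatClientBuilder.build()).thenReturn(chatClient);
        when(chatClient.prompt()).thenReturn(requestSpec);
        when(requestSpec.user(anyString())).thenReturn(requestSpec);
        when(requestSpec.call()).thenReturn(callSpec);

        aiService = new AiService(chatClientBuilder);
    }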

    @Test
    void testAskReturnsResponse() {
        when(callSpec.content()).thenReturn("Spring Boot is a Java framework");

        String result = aiService.ask("What is Spring Boot?");

        assertThat(result).isEqualTo("Spring Boot is a Java framework");
        verify(requestSpec).user("What is Spring Boot?");
    }

    @Test
    void testAskWithEmptyQuestionThrows() {
        assertThatThrownBy(() -> aiService.ask(""))
                .isInstanceOf(IllegalArgumentException.class);
    }
}
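
The test above needs four mocks before the first assertion runs. One alternative, sketched here as an assumption and not as how AiService is actually written, is to hide the ChatClient call behind a small interface so that a lambda can stand in for the model. AnswerGenerator and QuestionService are hypothetical names.

```java
/** Hypothetical port that hides the ChatClient fluent chain from callers. */
interface AnswerGenerator {
    String generate(String question);
}

/** Service logic worth unit testing: input validation plus delegation. */
class QuestionService {

    private final AnswerGenerator generator;

    QuestionService(AnswerGenerator generator) {
        this.generator = generator;
    }

    String ask(String question) {
        // Reject empty input before any (possibly billable) model call is made
        if (question == null || question.isBlank()) {
            throw new IllegalArgumentException("question must not be blank");
        }
        return generator.generate(question.strip());
    }
}
```

In production the generator could be a one-line adapter such as q -> chatClient.prompt().user(q).call().content(), while a unit test passes q -> "canned answer" and needs no Mockito setup at all.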

Testing Structured Output

@ExtendWith(MockitoExtension.class)
class BookExtractorTest {

    @Mock ChatClient.Builder builder;
    @Mock ChatClient         chatClient;
    @Mock ChatClient.ChatClientRequestSpec requestSpec;
    @Mock ChatClient.CallResponseSpec      callSpec;

    private BookExtractorService service;

    @BeforeEach
    void setup() {
        when(builder.build()).thenReturn(chatClient);
        when(chatClient.prompt()).thenReturn(requestSpec);
        when(requestSpec.user(any(Consumer.class))).thenReturn(requestSpec);
        when(requestSpec.call()).thenReturn(callSpec);
        service = new BookExtractorService(builder);
    }

    @Test
    void testExtractBookEntity() {
        // Simulate AI returning valid JSON
        when(callSpec.entity(Book.class)).thenReturn(
                new Book("Clean Code", "Robert Martin", 2008, "Programming",
                        "A guide to writing maintainable software")
        );

        Book result = service.extractBookDirect("Clean Code by Robert Martin...");

        assertThat(result.title()).isEqualTo("Clean Code");
        assertThat(result.author()).isEqualTo("Robert Martin");
        assertThat(result.year()).isEqualTo(2008);
    }
}
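
The Book type is not shown in this post. Judging from the five-argument constructor and the title(), author(), and year() accessors, it is probably a record along these lines. The component names genre and summary, the int type for year, and the validation rules are all guesses made for illustration.

```java
/** Assumed shape of Book; genre and summary are guessed component names. */
record Book(String title, String author, int year, String genre, String summary) {

    // Compact constructor: fail fast on fields the assertions rely on
    Book {
        if (title == null || title.isBlank()) {
            throw new IllegalArgumentException("title is required");
        }
        if (year <= 0) {
            throw new IllegalArgumentException("year must be positive");
        }
    }
}
```

Validating in the record means an implausible value fails where the object is built, not later in an unrelated assertion.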

Integration Test with MockMvc and Mocked AI

@WebMvcTest(AiController.class)
class AiControllerTest {

    @Autowired MockMvc mockMvc;
    // On Spring Boot 3.4+, @MockBean is deprecated; @MockitoBean is its replacement
    @MockBean  AiService aiService;

    @Test
    void testAskEndpoint() throws Exception {
        when(aiService.ask("What is RAG?"))
                .thenReturn("RAG is Retrieval Augmented Generation");

        mockMvc.perform(get("/ai/ask")
                        .param("q", "What is RAG?"))
                .andExpect(status().isOk())
                .andExpect(content().string("RAG is Retrieval Augmented Generation"));
    }

    @Test
    void testAskWithEmptyParam() throws Exception {
        mockMvc.perform(get("/ai/ask").param("q", ""))
                .andExpect(status().isBadRequest());
    }
}

Record-and-Replay Integration Test

@SpringBootTest
@ActiveProfiles("test")
@Tag("integration")   // lets fast CI runs exclude this class
class RagIntegrationTest {

    @Autowired RagChatService ragService;
    @Autowired DocumentIngestionService ingestion;

    @Test
    void testRagAnswersFromLoadedDocuments() {
        // Load deterministic test data
        ingestion.ingestText(
                "Java records are immutable data classes introduced in Java 16. " +
                "They auto-generate constructors, getters, equals, hashCode, and toString.",
                "test-doc");

        // This DOES call the AI — run with @Tag("integration") and skip in fast CI
        String answer = ragService.chat("test-session-1", "What are Java records?");

        assertThat(answer).containsIgnoringCase("record");
        assertThat(answer).containsIgnoringCase("java 16");
    }
}

// application-test.properties
// The test above makes live calls, so the key must be real; read it from the environment
// spring.ai.openai.api-key=${OPENAI_API_KEY}
// spring.ai.vectorstore.pgvector.initialize-schema=true
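
The test above talks to the live model on every run. A record-and-replay setup calls the model once, stores the answer, and serves the stored answer on later runs. The sketch below shows the idea in plain Java with an in-memory map. RecordReplayClient, Mode, and the cassette map are hypothetical names, not Spring AI types, and a real setup would persist the recordings to a file under src/test/resources.

```java
import java.util.Map;
import java.util.function.Function;

/** Wraps a prompt-to-answer function and stores answers so later runs can replay them. */
class RecordReplayClient implements Function<String, String> {

    enum Mode { RECORD, REPLAY }

    private final Mode mode;
    private final Function<String, String> liveModel; // real call, used only in RECORD mode
    private final Map<String, String> cassette;       // prompt -> recorded answer

    RecordReplayClient(Mode mode,
                       Function<String, String> liveModel,
                       Map<String, String> cassette) {
        this.mode = mode;
        this.liveModel = liveModel;
        this.cassette = cassette;
    }

    @Override
    public String apply(String prompt) {
        if (mode == Mode.RECORD) {
            // Call the real model once and remember what it said
            String answer = liveModel.apply(prompt);
            cassette.put(prompt, answer);
            return answer;
        }
        // REPLAY: never touch the network; fail loudly on an unrecorded prompt
        String recorded = cassette.get(prompt);
        if (recorded == null) {
            throw new IllegalStateException("No recorded answer for prompt: " + prompt);
        }
        return recorded;
    }
}
```

Failing on an unrecorded prompt is deliberate: when a prompt template changes, the replay test breaks and signals that the recordings need refreshing against the live model.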

AI Quality Evaluation Test

@SpringBootTest
@Tag("evaluation")   // run separately — slow and costly
class AiQualityEvaluationTest {

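    // Spring AI auto-configures a ChatClient.Builder bean, not a ChatClient, so inject
    // the builder and build the client (unless the app defines its own ChatClient bean)
    private final ChatClient chatClient;

    AiQualityEvaluationTest(@Autowired ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }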

    // Evaluator: use a separate AI call to judge the answer quality
    private boolean isGoodAnswer(String question, String answer) {
        String evaluation = chatClient.prompt()
                .user("""
                      Question: %s
                      Answer: %s

                      Rate this answer on a scale of 1-5 for:
                      - Accuracy (1=wrong, 5=completely correct)
                      - Relevance (1=off-topic, 5=directly answers the question)

                      Respond with ONLY: PASS or FAIL
                      PASS = both scores are 4 or 5
                      FAIL = any score is 3 or below
                      """.formatted(question, answer))
                .call()
                .content()
                .trim();

        return evaluation.startsWith("PASS");
    }

    @ParameterizedTest
    @CsvSource({
        "What is @SpringBootApplication?, Spring Boot",
        "How do you define a REST endpoint in Spring?,@RestController",
        "What does @Transactional do?,transaction"
    })
    void testAnswerQuality(String question, String expectedKeyword) {
        String answer = chatClient.prompt().user(question).call().content();

        // Basic assertion
        assertThat(answer.toLowerCase()).contains(expectedKeyword.toLowerCase());

        // AI evaluation assertion
        assertThat(isGoodAnswer(question, answer))
                .as("Answer for '%s' should be rated PASS by evaluator", question)
                .isTrue();
    }
}
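
The evaluator's reply is itself model output, so startsWith("PASS") can misread replies such as "pass." or "**PASS**", and it cannot tell a genuine FAIL from a reply that ignored the requested format. A stricter parser, sketched below with the hypothetical names VerdictParser and Verdict, looks only at the first alphabetic word and reports anything unexpected as UNPARSEABLE.

```java
import java.util.Locale;

/** Turns the evaluator's free-text reply into a strict three-way verdict. */
final class VerdictParser {

    enum Verdict { PASS, FAIL, UNPARSEABLE }

    private VerdictParser() { }

    static Verdict parse(String raw) {
        if (raw == null) {
            return Verdict.UNPARSEABLE;
        }
        // Split on anything that is not a letter, then judge the first real word only
        for (String token : raw.split("[^A-Za-z]+")) {
            if (token.isEmpty()) {
                continue; // leading punctuation or whitespace yields an empty first token
            }
            return switch (token.toUpperCase(Locale.ROOT)) {
                case "PASS" -> Verdict.PASS;
                case "FAIL" -> Verdict.FAIL;
                default -> Verdict.UNPARSEABLE;
            };
        }
        return Verdict.UNPARSEABLE;
    }
}
```

With this in place, isGoodAnswer could return VerdictParser.parse(evaluation) == Verdict.PASS, and a test can fail with a distinct message when the evaluator did not follow the format, which usually points at the evaluation prompt and not at the answer being judged.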

Test Configuration

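# src/test/resources/application-test.properties
# Comments must sit on their own lines: in a .properties file, text after a value becomes part of the value
spring.ai.openai.api-key=${OPENAI_API_KEY:test-key}
# Use the in-memory store for tests (verify this property against your Spring AI version;
# SimpleVectorStore can also be declared as a @Bean in a test configuration)
spring.ai.vectorstore.simple.enabled=true
# Use the test Redis port
spring.data.redis.port=16379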

Key Points

  • Mock ChatClient by mocking the builder chain — since ChatClient uses a fluent API, each step needs its own mock
  • Use @MockBean AiService in @WebMvcTest to test controllers without any AI calls
  • Tag integration tests with @Tag("integration") and evaluation tests with @Tag("evaluation") — exclude them from fast unit test runs in CI
  • AI evaluator tests (using AI to judge AI) are expensive but valuable for regression testing when you change prompts
  • Use SimpleVectorStore in test profile instead of PGVector to avoid database dependency in tests