Java9R: Spring AI with Google Gemini — 1M Token Context and Multimodal AI in Spring Boot

Google Gemini 1.5 Pro has a 1 million token context window (roughly 750,000 words, or about ten full-length textbooks). That is enough to analyze an entire codebase, a book-length document, or a long video or audio file in a single request. Spring AI integrates with Gemini through the Vertex AI starter and exposes it through the same ChatClient API used for other providers.

When to Choose Gemini

Choose Gemini 1.5 Pro when:
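  ✔ Your document exceeds the context window of other major models (about 128k tokens for GPT-4o, 200k for Claude 3.5 Sonnet)
  ✔ You need to analyze an entire codebase in one call
  ✔ You need audio or video processing (Gemini accepts both natively)
  ✔ Cost matters: $1.25/M input tokens vs $3.00 for Claude Sonnet (list prices at the time of writing; very long prompts are billed at a higher rate, so check current pricing)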

Choose Gemini 1.5 Flash when:
  ✔ High-volume classification or extraction
  ✔ Price: $0.075/M input tokens, the lowest list price among the models compared here
  ✔ Speed: lower latency than Pro for simple tasks
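
The criteria above can be turned into a rough pre-flight check. The sketch below is illustrative only and not part of Spring AI: the class `GeminiModelChooser` and its method names are hypothetical, the token estimate reuses this article's ratio of 1M tokens to about 750,000 words (a real tokenizer will give different numbers, especially for source code), and the prices are the ones quoted above.

```java
/** Illustrative helper (hypothetical, not a Spring AI class): picks a Gemini model from a rough token estimate. */
final class GeminiModelChooser {

    // Ratio taken from the article's own figure: 1,000,000 tokens is roughly 750,000 words.
    private static final double WORDS_PER_TOKEN = 0.75;

    // Input prices per million tokens as quoted in this article; verify against current pricing.
    private static final double PRO_USD_PER_MILLION = 1.25;
    private static final double FLASH_USD_PER_MILLION = 0.075;

    private GeminiModelChooser() {
    }

    /** Rough token estimate from a whitespace word count. */
    static long estimateTokens(String text) {
        String trimmed = text.strip();
        if (trimmed.isEmpty()) {
            return 0;
        }
        long words = trimmed.split("\\s+").length;
        return (long) Math.ceil(words / WORDS_PER_TOKEN);
    }

    /** Returns the Pro model for inputs above the caller-chosen threshold, otherwise Flash. */
    static String chooseModel(long estimatedTokens, long largeInputThreshold) {
        return estimatedTokens > largeInputThreshold
                ? "gemini-1.5-pro-001"
                : "gemini-1.5-flash-001";
    }

    /** Estimated input cost in USD for the given model name and token count. */
    static double estimateInputCostUsd(String model, long tokens) {
        double perMillion = model.startsWith("gemini-1.5-pro")
                ? PRO_USD_PER_MILLION
                : FLASH_USD_PER_MILLION;
        return tokens / 1_000_000.0 * perMillion;
    }
}
```

The threshold is left to the caller on purpose: where a prompt becomes "large" depends on which other models are being considered and on how the provider prices long prompts.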

Setup — Google Cloud Project

# Prerequisites:
# 1. Create a Google Cloud project
# 2. Enable Vertex AI API: gcloud services enable aiplatform.googleapis.com
# 3. Authenticate: gcloud auth application-default login

# Or set service account key:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
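
Credential problems tend to surface only on the first model call, so a small startup diagnostic can help. The following is a minimal sketch, assuming nothing beyond the JDK: `GoogleCredentialsCheck` is a hypothetical helper that only inspects the `GOOGLE_APPLICATION_CREDENTIALS` variable shown above and does not validate the key itself.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

/** Hypothetical startup diagnostic: reports how Google Cloud credentials appear to be configured. */
final class GoogleCredentialsCheck {

    private GoogleCredentialsCheck() {
    }

    /** Describes the credential setup found in the given environment map. */
    static String describe(Map<String, String> env) {
        String keyPath = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (keyPath == null || keyPath.isBlank()) {
            // No key file configured: other Application Default Credentials sources,
            // such as `gcloud auth application-default login`, must supply credentials.
            return "GOOGLE_APPLICATION_CREDENTIALS not set; expecting other Application Default Credentials sources";
        }
        Path path = Path.of(keyPath);
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            return "GOOGLE_APPLICATION_CREDENTIALS points to a missing or unreadable file: " + path;
        }
        return "Using service account key at " + path;
    }
}
```

In an application this could be logged once at startup with `GoogleCredentialsCheck.describe(System.getenv())`.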

Maven Dependency

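<!-- No version is given here, so it must be managed by the Spring AI BOM in dependencyManagement. -->
<!-- This is the milestone-era artifact name; Spring AI 1.0 GA renamed it to spring-ai-starter-model-vertex-ai-gemini. -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vertex-ai-gemini-spring-boot-starter</artifactId>
</dependency>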

application.properties

spring.ai.vertex.ai.gemini.project-id=${GOOGLE_CLOUD_PROJECT_ID}
spring.ai.vertex.ai.gemini.location=us-central1
spring.ai.vertex.ai.gemini.chat.options.model=gemini-1.5-flash-001
spring.ai.vertex.ai.gemini.chat.options.temperature=0.7
spring.ai.vertex.ai.gemini.chat.options.candidate-count=1

Basic Chat Service

@Service
public class GeminiService {

    private final ChatClient chatClient;

    public GeminiService(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultSystem("You are a helpful Java programming assistant.")
                .build();
    }

    public String ask(String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}

Massive Document Analysis with 1M Context

@Service
public class LargeDocumentService {

    private final ChatClient geminiPro;

    public LargeDocumentService(ChatClient.Builder builder) {
        // Use Gemini 1.5 Pro for large context tasks
        this.geminiPro = builder
                .defaultOptions(VertexAiGeminiChatOptions.builder()
                        .withModel("gemini-1.5-pro-001")
                        .withTemperature(0.3f)
                        .build())
                .build();
    }

    // Analyze an entire codebase in one call, without chunking.
    // The source is passed as a template parameter rather than concatenated into the
    // prompt text, so braces in the Java code are not read as template placeholders.
    public String analyzeCodebase(String entireCodebaseText) {
        return geminiPro.prompt()
                .user(u -> u.text("""
                      This is a complete Java Spring Boot codebase.
                      Analyze it and provide:
                      1. Architecture overview and patterns used
                      2. Security vulnerabilities found
                      3. Performance bottlenecks
                      4. Missing test coverage areas
                      5. Recommended refactoring priorities

                      Codebase:
                      {codebase}
                      """)
                        .param("codebase", entireCodebaseText))
                .call()
                .content();
    }

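    // A document that fits in the context window can be passed whole instead of being chunked for RAG.
    // Task and document are template parameters so their content is never parsed as a template.
    public String analyzeDocument(String fullDocument, String task) {
        return geminiPro.prompt()
                .user(u -> u.text("{task}\n\nFull Document:\n{document}")
                        .param("task", task)
                        .param("document", fullDocument))
                .call()
                .content();
    }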
}
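
`analyzeCodebase` expects the whole codebase as a single string, which has to be assembled first. One possible way to do that is sketched below using only the JDK. `CodebaseReader`, the `.java` filter, and the `// ===== File: ... =====` header format are assumptions made for illustration; the headers simply give the model a file name to refer to.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: flattens a source tree into one prompt-ready string. */
final class CodebaseReader {

    private CodebaseReader() {
    }

    /** Reads every .java file under root and joins them in path order. */
    static String readJavaSources(Path root) {
        SortedMap<String, String> sources = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            List<Path> files = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .collect(Collectors.toList());
            for (Path file : files) {
                // Normalize separators so headers look the same on every platform.
                String relative = root.relativize(file).toString().replace('\\', '/');
                sources.put(relative, Files.readString(file, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return join(sources);
    }

    /** Joins sources in key order, prefixing each with a header that names the file. */
    static String join(SortedMap<String, String> sourcesByPath) {
        StringBuilder sb = new StringBuilder();
        sourcesByPath.forEach((path, content) -> sb
                .append("// ===== File: ").append(path).append(" =====\n")
                .append(content)
                .append("\n\n"));
        return sb.toString();
    }
}
```

The result can be passed straight to `analyzeCodebase`. For a large project it is worth estimating the token count first, since anything beyond the context window will be rejected by the model.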

Gemini Multimodal — Text + Images

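import java.util.List;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
// Media has moved between packages across Spring AI releases; adjust this import to your version.
import org.springframework.ai.chat.messages.Media;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.stereotype.Service;
import org.springframework.util.MimeType;
import org.springframework.util.MimeTypeUtils;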

@Service
public class GeminiVisionService {

    private final ChatClient chatClient;

    public GeminiVisionService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String analyzeImage(byte[] imageBytes, String mimeType, String question) {
        UserMessage message = new UserMessage(
                question,
                List.of(new Media(MimeType.valueOf(mimeType),
                        new ByteArrayResource(imageBytes)))
        );

        return chatClient.prompt()
                .messages(message)
                .call()
                .content();
    }

    // Assumes both images are PNG; pass explicit MIME types if other formats are possible
    public String compareImages(byte[] image1, byte[] image2, String question) {
        UserMessage message = new UserMessage(
                question,
                List.of(
                    new Media(MimeTypeUtils.IMAGE_PNG, new ByteArrayResource(image1)),
                    new Media(MimeTypeUtils.IMAGE_PNG, new ByteArrayResource(image2))
                )
        );

        return chatClient.prompt().messages(message).call().content();
    }
}
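
`analyzeImage` relies on the caller to supply the MIME type, and `compareImages` hard-codes PNG. When the bytes come from an upload or an unknown source, the type can be sniffed from the file signature instead. The sketch below is a hypothetical helper (`ImageMimeSniffer` is not part of Spring or Spring AI) that recognizes only PNG and JPEG; other formats would need their own signatures.

```java
import java.util.Optional;

/** Hypothetical helper: detects PNG and JPEG images from their leading magic bytes. */
final class ImageMimeSniffer {

    private ImageMimeSniffer() {
    }

    /** Returns the MIME type if the signature is recognized, otherwise an empty Optional. */
    static Optional<String> sniff(byte[] data) {
        // PNG signature: 89 50 4E 47 0D 0A 1A 0A
        if (startsWith(data, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) {
            return Optional.of("image/png");
        }
        // JPEG files start with FF D8 FF
        if (startsWith(data, 0xFF, 0xD8, 0xFF)) {
            return Optional.of("image/jpeg");
        }
        return Optional.empty();
    }

    /** Compares the first bytes of data, treated as unsigned, against the expected values. */
    private static boolean startsWith(byte[] data, int... magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller could then use `ImageMimeSniffer.sniff(imageBytes).orElseThrow()` to obtain the string passed as `mimeType` to `analyzeImage`, rejecting unrecognized uploads before they reach the model.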

Streaming with Gemini

@GetMapping(value = "/gemini/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> streamGemini(@RequestParam String q) {
    return chatClient.prompt()
            .user(q)
            .stream()
            .content();
}

Gemini for Structured Output

public record TechStack(
        String language,
        String framework,
        String database,
        List<String> dependencies,
        String buildTool
) {}

public TechStack detectTechStack(String readmeOrPomContent) {
    return chatClient.prompt()
            .user("Detect the technology stack from this project file: " + readmeOrPomContent)
            .call()
            .entity(TechStack.class);
}
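
A model may leave fields empty or omit the list entirely, so it can be worth making the target record defensive. The variant below is a sketch under that assumption: `ValidatedTechStack` and `missingFields()` are hypothetical names, and it relies on the JSON mapper constructing the record through its canonical constructor, which is how records are normally deserialized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical defensive variant of the TechStack record. */
record ValidatedTechStack(
        String language,
        String framework,
        String database,
        List<String> dependencies,
        String buildTool
) {
    // Compact constructor: normalize a missing list to an empty one and drop null entries.
    ValidatedTechStack {
        dependencies = dependencies == null
                ? List.of()
                : dependencies.stream().filter(Objects::nonNull).toList();
    }

    /** Names of the scalar fields the model left null or blank, in declaration order. */
    List<String> missingFields() {
        List<String> missing = new ArrayList<>();
        if (isBlank(language)) missing.add("language");
        if (isBlank(framework)) missing.add("framework");
        if (isBlank(database)) missing.add("database");
        if (isBlank(buildTool)) missing.add("buildTool");
        return List.copyOf(missing);
    }

    private static boolean isBlank(String value) {
        return value == null || value.isBlank();
    }
}
```

Passing `ValidatedTechStack.class` to `.entity(...)` instead of `TechStack.class` would let the caller check `missingFields()` and decide whether to retry or accept a partial result.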

Output

// Illustrative response from analyzeCodebase(entire Spring Boot project); actual model output will vary
Architecture Overview:
  - Clean layered architecture: Controller → Service → Repository
  - Spring Boot 3.3 with Jakarta EE 9 namespaces
  - Hibernate 6.x for JPA with PostgreSQL

Security Issues Found:
  1. UserController.java:45 — SQL injection risk in custom @Query
  2. AuthService.java:78 — password comparison using == instead of .equals()
  3. Missing @PreAuthorize on admin endpoints

Performance Bottlenecks:
  1. N+1 query in OrderService.findAllWithItems() — missing @EntityGraph
  2. No caching on frequently-called getProductById()

Missing Test Coverage:
  1. UserService has 0% test coverage
  2. Exception paths in PaymentService not tested

Key Points

Gemini 1.5 Pro's 1M context window can remove the need for RAG chunking when the whole document set fits in a single prompt
Gemini 1.5 Flash is the lowest-priced model discussed here for high-volume classification, at $0.075/M input tokens, which is tens of times cheaper than GPT-4o at list prices
The Spring AI Vertex AI starter requires Google Cloud credentials: use Application Default Credentials (gcloud auth application-default login) or a service account key
Spring AI features such as streaming, structured output, tool calling, and advisors are used through the same ChatClient API with Gemini as with other providers
For code analysis, passing the full source to Gemini Pro often works better than chunked RAG, because the model can see cross-file relationships in one prompt
