Spring AI with Google Gemini — 1M Token Context and Multimodal AI in Spring Boot
Google Gemini 1.5 Pro has a 1 million token context window (roughly 750,000 words, or about ten textbooks), which makes it well suited to analyzing entire codebases, processing book-length documents, and working with long video or audio files. Spring AI integrates with Gemini through the Vertex AI starter and exposes it through the same ChatClient API used for other providers.
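Before sending a large input, it can help to check roughly whether it fits. The sketch below turns the words-to-tokens ratio quoted above into a quick estimate. The class name ContextBudget, the 0.75 words-per-token constant, and the output reserve are illustrative assumptions, not part of Spring AI, and the ratio only holds loosely for English prose (source code usually tokenizes less favorably).

```java
final class ContextBudget {

    // Assumption derived from the ratio quoted above: 1M tokens ~ 750,000 words.
    static final double WORDS_PER_TOKEN = 0.75;

    // Context window of Gemini 1.5 Pro as stated in this article.
    static final long GEMINI_PRO_WINDOW = 1_000_000L;

    /** Rough token estimate based on a whitespace word count. */
    public static long estimateTokens(String text) {
        String trimmed = text.strip();
        if (trimmed.isEmpty()) {
            return 0;
        }
        long words = trimmed.split("\\s+").length;
        return (long) Math.ceil(words / WORDS_PER_TOKEN);
    }

    /** True if the estimate plus a reserve for the model's reply fits the window. */
    public static boolean fitsInWindow(String text, long reservedForOutput) {
        return estimateTokens(text) + reservedForOutput <= GEMINI_PRO_WINDOW;
    }

    public static void main(String[] args) {
        String sample = "Spring AI integrates with Gemini through Vertex AI";
        System.out.println("Estimated tokens: " + estimateTokens(sample));
        System.out.println("Fits in window:   " + fitsInWindow(sample, 8_192));
    }
}
```

For anything close to the limit, an exact count from the provider's own token-counting facility is the safer check; this estimate is only meant as a cheap first filter.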
When to Choose Gemini
Choose Gemini 1.5 Pro when:
✔ Your document exceeds the context windows of competing models (128k tokens for GPT-4o, 200k for Claude 3.5 Sonnet)
✔ You need to analyze an entire codebase in one call
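✔ You need to process audio or video (Gemini accepts both natively)
✔ Cost matters: $1.25/M input tokens vs $3.00 for Claude Sonnet (list prices at the time of writing; prompts over 128k tokens are billed at a higher rate)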
Choose Gemini 1.5 Flash when:
✔ High-volume classification or extraction
✔ Price: $0.075/M input tokens, among the cheapest of the major models at the time of writing
✔ Speed: lowest latency for simple tasks
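To make the price comparison concrete, here is a small sketch that turns the per-million-token prices listed above into a per-request input cost. The class and method names are hypothetical, the prices are a snapshot copied from this article, and the calculation assumes a flat per-token rate (output tokens and any length-based pricing tiers are ignored).

```java
final class InputCostEstimator {

    // USD per 1M input tokens, as quoted in the lists above (snapshot values).
    static final double GEMINI_PRO = 1.25;
    static final double GEMINI_FLASH = 0.075;
    static final double CLAUDE_SONNET = 3.00;

    /** Input cost in USD for one request, assuming a flat per-token price. */
    public static double inputCostUsd(long inputTokens, double usdPerMillionTokens) {
        return inputTokens / 1_000_000.0 * usdPerMillionTokens;
    }

    public static void main(String[] args) {
        long tokens = 100_000; // hypothetical prompt size
        System.out.printf("Gemini 1.5 Pro:   $%.4f%n", inputCostUsd(tokens, GEMINI_PRO));
        System.out.printf("Gemini 1.5 Flash: $%.4f%n", inputCostUsd(tokens, GEMINI_FLASH));
        System.out.printf("Claude Sonnet:    $%.4f%n", inputCostUsd(tokens, CLAUDE_SONNET));
    }
}
```

Multiplying the per-request figure by the expected daily request count gives a rough budget for high-volume workloads such as classification.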
Setup — Google Cloud Project
# Prerequisites:
# 1. Create a Google Cloud project
# 2. Enable Vertex AI API: gcloud services enable aiplatform.googleapis.com
# 3. Authenticate: gcloud auth application-default login
# Or set service account key:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
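A missing credentials file is a common cause of startup failures, so a pre-flight check can save some debugging. The sketch below looks for a credentials file in the two places described above: the GOOGLE_APPLICATION_CREDENTIALS variable, then the file that gcloud auth application-default login writes on Linux and macOS. The class AdcPreflight is hypothetical and is not part of Spring AI or the Google client libraries. It also does not cover credentials supplied by a metadata server on Google Cloud runtimes.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Optional;

final class AdcPreflight {

    /**
     * Returns the credentials file that would likely be picked up, checking
     * the environment variable first and the gcloud default file second.
     */
    public static Optional<Path> findCredentials(Map<String, String> env, Path userHome) {
        String explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (explicit != null && !explicit.isBlank()) {
            // An explicit path that does not exist is treated as "not found".
            Path p = Path.of(explicit);
            return Files.isRegularFile(p) ? Optional.of(p) : Optional.empty();
        }
        // Default location used by gcloud on Linux/macOS (assumption: not Windows).
        Path gcloudDefault = userHome.resolve(".config")
                .resolve("gcloud")
                .resolve("application_default_credentials.json");
        return Files.isRegularFile(gcloudDefault) ? Optional.of(gcloudDefault) : Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> found = findCredentials(
                System.getenv(), Path.of(System.getProperty("user.home")));
        System.out.println(found
                .map(p -> "Using credentials at " + p)
                .orElse("No credentials file found; run: gcloud auth application-default login"));
    }
}
```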
Maven Dependency
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vertex-ai-gemini-spring-boot-starter</artifactId>
</dependency>
application.properties
spring.ai.vertex.ai.gemini.project-id=${GOOGLE_CLOUD_PROJECT_ID}
spring.ai.vertex.ai.gemini.location=us-central1
spring.ai.vertex.ai.gemini.chat.options.model=gemini-1.5-flash-001
spring.ai.vertex.ai.gemini.chat.options.temperature=0.7
spring.ai.vertex.ai.gemini.chat.options.candidate-count=1
Basic Chat Service
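import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class GeminiService {

    private final ChatClient chatClient;

    // Spring Boot auto-configures ChatClient.Builder for the Gemini starter
    public GeminiService(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultSystem("You are a helpful Java programming assistant.")
                .build();
    }

    public String ask(String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}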
Massive Document Analysis with 1M Context
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.vertexai.gemini.VertexAiGeminiChatOptions;
import org.springframework.stereotype.Service;

@Service
public class LargeDocumentService {

    private final ChatClient geminiPro;

    public LargeDocumentService(ChatClient.Builder builder) {
        // Use Gemini 1.5 Pro for large context tasks.
        // Note: builder method names and the temperature type depend on the
        // Spring AI version (withModel/withTemperature in the 1.0 milestones).
        this.geminiPro = builder
                .defaultOptions(VertexAiGeminiChatOptions.builder()
                        .withModel("gemini-1.5-pro-001")
                        .withTemperature(0.3f)
                        .build())
                .build();
    }

    // Analyze an entire codebase in one call, without chunking
    public String analyzeCodebase(String entireCodebaseText) {
        return geminiPro.prompt()
                .user("""
                        This is a complete Java Spring Boot codebase.
                        Analyze it and provide:
                        1. Architecture overview and patterns used
                        2. Security vulnerabilities found
                        3. Performance bottlenecks
                        4. Missing test coverage areas
                        5. Recommended refactoring priorities
                        Codebase:
                        %s
                        """.formatted(entireCodebaseText))
                .call()
                .content();
    }

    // For documents that fit in the context window, pass the full text
    // instead of retrieving RAG chunks
    public String analyzeDocument(String fullDocument, String task) {
        return geminiPro.prompt()
                .user(task + "\n\nFull Document:\n" + fullDocument)
                .call()
                .content();
    }
}
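The analyzeCodebase method above expects the whole codebase as a single string, but the article does not show how that string is built. One possible approach, sketched below, walks a source tree and concatenates every .java file with a path header so the model can refer to files by name. The class CodebaseCollector and the "// FILE:" header format are illustrative assumptions, and the sketch assumes all files are valid UTF-8.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

final class CodebaseCollector {

    /** Concatenates all .java files under root, each preceded by its relative path. */
    public static String collect(Path root) throws IOException {
        List<Path> sources;
        try (Stream<Path> paths = Files.walk(root)) {
            sources = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".java"))
                    .sorted() // stable order keeps prompts reproducible
                    .toList();
        }
        StringBuilder out = new StringBuilder();
        for (Path source : sources) {
            // A path header lets the model cite file names in its findings.
            out.append("// FILE: ").append(root.relativize(source)).append('\n');
            out.append(Files.readString(source)).append("\n\n");
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        String codebase = collect(root);
        System.out.println(codebase.length() + " characters collected from " + root);
    }
}
```

The resulting string can then be passed to analyzeCodebase. In practice it may be worth excluding generated sources and build output directories to keep the token count down.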
Gemini Multimodal — Text + Images
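import java.util.List;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
// The package of Media depends on the Spring AI version; adjust this import if needed
import org.springframework.ai.chat.messages.Media;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.stereotype.Service;
import org.springframework.util.MimeType;
import org.springframework.util.MimeTypeUtils;

@Service
public class GeminiVisionService {

    private final ChatClient chatClient;

    public GeminiVisionService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // Ask a question about a single image; mimeType is e.g. "image/png"
    public String analyzeImage(byte[] imageBytes, String mimeType, String question) {
        UserMessage message = new UserMessage(
                question,
                List.of(new Media(MimeType.valueOf(mimeType),
                        new ByteArrayResource(imageBytes)))
        );
        return chatClient.prompt()
                .messages(message)
                .call()
                .content();
    }

    // Send two images in one message; this method assumes both are PNG
    public String compareImages(byte[] image1, byte[] image2, String question) {
        UserMessage message = new UserMessage(
                question,
                List.of(
                        new Media(MimeTypeUtils.IMAGE_PNG, new ByteArrayResource(image1)),
                        new Media(MimeTypeUtils.IMAGE_PNG, new ByteArrayResource(image2))
                )
        );
        return chatClient.prompt().messages(message).call().content();
    }
}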
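The analyzeImage method above takes the MIME type as a parameter, and compareImages assumes PNG, so callers need a way to know what they are sending. One option, sketched below with a hypothetical ImageTypeSniffer class, is to inspect the well-known file signatures at the start of the byte array. Only PNG and JPEG are handled here; other formats would need their own signatures.

```java
import java.util.Optional;

final class ImageTypeSniffer {

    // Standard file signatures ("magic numbers") for PNG and JPEG.
    private static final byte[] PNG =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG =
            {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    /** Returns the MIME type for PNG or JPEG data, or empty if unrecognized. */
    public static Optional<String> detect(byte[] data) {
        if (startsWith(data, PNG)) {
            return Optional.of("image/png");
        }
        if (startsWith(data, JPEG)) {
            return Optional.of("image/jpeg");
        }
        return Optional.empty();
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] jpegHeader = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        System.out.println(detect(jpegHeader).orElse("unknown"));
    }
}
```

A caller could pass the detected value as the mimeType argument of analyzeImage and reject the upload when the result is empty.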
Streaming with Gemini
// Handler method inside a @RestController that holds a ChatClient field
@GetMapping(value = "/gemini/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> streamGemini(@RequestParam String q) {
    return chatClient.prompt()
            .user(q)
            .stream()
            .content();
}
Gemini for Structured Output
public record TechStack(
        String language,
        String framework,
        String database,
        List<String> dependencies,
        String buildTool
) {}

public TechStack detectTechStack(String readmeOrPomContent) {
    return chatClient.prompt()
            .user("Detect the technology stack from this project file: " + readmeOrPomContent)
            .call()
            .entity(TechStack.class);
}
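A model can leave fields out of its structured response, so the mapped record may contain nulls. The sketch below is a variant of the TechStack record above with a compact constructor that normalizes the dependency list, plus a hypothetical isComplete helper. It assumes the JSON mapper creates the record through its canonical constructor, as Jackson does for records; none of this is required by Spring AI.

```java
import java.util.List;
import java.util.Objects;

record TechStack(
        String language,
        String framework,
        String database,
        List<String> dependencies,
        String buildTool) {

    // Compact constructor: replaces a missing list with an empty one and
    // drops null entries, so callers can iterate without null checks.
    TechStack {
        dependencies = dependencies == null
                ? List.of()
                : dependencies.stream().filter(Objects::nonNull).toList();
    }

    /** True when every scalar field was filled in. */
    public boolean isComplete() {
        return language != null && framework != null
                && database != null && buildTool != null;
    }
}

final class TechStackDemo {
    public static void main(String[] args) {
        // Simulates a response where the model omitted two fields.
        TechStack partial = new TechStack("Java", "Spring Boot", null, null, "Maven");
        System.out.println("Dependencies: " + partial.dependencies());
        System.out.println("Complete:     " + partial.isComplete());
    }
}
```

A caller might retry the prompt or fall back to a default when isComplete returns false.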
Output
// analyzeCodebase(entire Spring Boot project)
Architecture Overview:
- Clean layered architecture: Controller → Service → Repository
- Spring Boot 3.3 with Jakarta EE 9 namespaces
- Hibernate 6.x for JPA with PostgreSQL
Security Issues Found:
1. UserController.java:45 — SQL injection risk in custom @Query
2. AuthService.java:78 — password comparison using == instead of .equals()
3. Missing @PreAuthorize on admin endpoints
Performance Bottlenecks:
1. N+1 query in OrderService.findAllWithItems() — missing @EntityGraph
2. No caching on frequently-called getProductById()
Missing Test Coverage:
1. UserService has 0% test coverage
2. Exception paths in PaymentService not tested
Key Points
- Gemini 1.5 Pro's 1M context window eliminates the need for RAG chunking on moderately-sized document sets
- Gemini 1.5 Flash is a highly cost-effective model for high-volume classification: at $0.075/M input tokens it is tens of times cheaper than GPT-4o at list prices at the time of writing
- The Spring AI Vertex AI starter requires Google Cloud credentials — use Application Default Credentials (gcloud auth) or a service account key
- The core Spring AI features (streaming, structured output, tool calling, advisors) are available for Gemini through the same ChatClient API, although provider-specific options and limits still differ
- For code analysis, passing the full source to Gemini Pro is often better than chunked RAG — the model understands cross-file relationships