Spring AI Vector Store Migration — Switch Providers Without Rewriting Code

Spring AI's VectorStore interface abstracts over more than 15 providers (PGVector, Chroma, Pinecone, Weaviate, Milvus, Elasticsearch, Redis, and more). This tutorial shows how to switch between vector stores by changing only a dependency and configuration, with no changes to application code. It also compares providers and helps you choose the right one for your use case.

Vector Store Provider Comparison

Provider           Type         Scale      Cost         Best For
────────────────────────────────────────────────────────────────────────────
PGVector           PostgreSQL   Millions   Free (self)  Existing Postgres DB
SimpleVectorStore  In-memory    Thousands  Free         Dev/prototyping
Chroma             Standalone   Millions   Free (self)  Dev, small projects
Pinecone           SaaS         Billions   $70+/mo      Production SaaS, scale
Weaviate           Hybrid       Billions   Free/paid    Multi-modal search
Milvus             Standalone   Billions   Free (self)  Billion-scale vectors
Elasticsearch      Hybrid       Billions   License      Existing ES users
Redis              In-memory    Millions   License      Low-latency use cases

The VectorStore Interface — Your Migration Shield

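import java.util.List;
import java.util.Map;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

// Your service code: it never depends on a specific provider
@Service
public class DocumentSearchService {

    // VectorStore interface: works with any provider
    private final VectorStore vectorStore;

    public DocumentSearchService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public void store(String content, Map<String, Object> metadata) {
        vectorStore.add(List.of(new Document(content, metadata)));
    }

    public List<Document> search(String query, int topK) {
        // Pre-1.0 milestone API. On Spring AI 1.0 and later, build the request with
        // SearchRequest.builder().query(query).topK(topK).build() instead.
        return vectorStore.similaritySearch(
                SearchRequest.query(query).withTopK(topK));
    }
}
// This exact same code works with any vector store below, with zero changes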
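
Because the service depends only on an interface, it can also be exercised without any real vector store. The following is a minimal sketch of that idea in plain Java. MiniVectorStore, MiniSearchService and KeywordFakeStore are hypothetical stand-ins invented for this illustration, not Spring AI types, and the fake matches by substring rather than by embedding similarity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a vector store interface, reduced to two methods.
interface MiniVectorStore {
    void add(List<String> documents);
    List<String> similaritySearch(String query, int topK);
}

// Service code that sees only the interface, in the same spirit as DocumentSearchService.
final class MiniSearchService {
    private final MiniVectorStore store;

    MiniSearchService(MiniVectorStore store) {
        this.store = store;
    }

    void store(String content) {
        store.add(List.of(content));
    }

    List<String> search(String query, int topK) {
        return store.similaritySearch(query, topK);
    }
}

// Fake implementation for tests: case-insensitive substring match instead of embeddings.
final class KeywordFakeStore implements MiniVectorStore {
    private final List<String> documents = new ArrayList<>();

    @Override
    public void add(List<String> docs) {
        documents.addAll(docs);
    }

    @Override
    public List<String> similaritySearch(String query, int topK) {
        String needle = query.toLowerCase();
        return documents.stream()
                .filter(d -> d.toLowerCase().contains(needle))
                .limit(topK)
                .collect(Collectors.toList());
    }
}
```

Swapping KeywordFakeStore for another implementation requires no change to MiniSearchService, which is the same property the real VectorStore interface gives you across providers.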

Provider 1 — SimpleVectorStore (Development)

// No extra dependency: included in spring-ai-core
@Bean
public VectorStore vectorStore(EmbeddingModel embeddingModel) {
    // In-memory store; its contents can be saved to and loaded from a JSON file
    SimpleVectorStore store = SimpleVectorStore.builder(embeddingModel).build();

    // Reload a previous snapshot if one exists.
    // Nothing is written automatically: call store.save(file) to create the snapshot.
    Path persistPath = Path.of("vector-store.json");
    if (Files.exists(persistPath)) {
        store.load(persistPath.toFile());
    }
    return store;
}
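
To see why an in-memory store suits thousands of documents but not millions, it helps to look at what a brute-force similarity search does: score every stored vector against the query and keep the best few. The sketch below is an illustration of that general technique, not SimpleVectorStore's actual source; BruteForceSearch and its method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class BruteForceSearch {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vector dimensions differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Scores every stored vector against the query (linear in the number of
    // vectors) and returns the ids of the topK highest-scoring entries.
    static List<String> topK(Map<String, float[]> vectors, float[] query, int topK) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, float[]> e : vectors.entrySet()) {
            scored.add(Map.entry(e.getKey(), cosineSimilarity(e.getValue(), query)));
        }
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(topK, scored.size()); i++) {
            ids.add(scored.get(i).getKey());
        }
        return ids;
    }
}
```

Every query touches every vector, so cost grows linearly with the collection. Index structures such as HNSW exist to avoid exactly this full scan.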

Provider 2 — PGVector (Production with PostgreSQL)

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>

# application.properties
spring.ai.vectorstore.pgvector.initialize-schema=true
spring.ai.vectorstore.pgvector.dimensions=1536
spring.ai.vectorstore.pgvector.distance-type=cosine_distance
spring.ai.vectorstore.pgvector.index-type=hnsw
spring.datasource.url=jdbc:postgresql://localhost:5432/vectordb
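
The dimensions setting has to match the length of the vectors your embedding model produces, and that constraint follows you when you switch providers. A small guard like the one below can surface a mismatch early. It is a sketch only: DimensionGuard is a hypothetical helper, not part of Spring AI, and 1536 is simply the value from the configuration above.

```java
final class DimensionGuard {
    private final int expectedDimensions;

    DimensionGuard(int expectedDimensions) {
        if (expectedDimensions <= 0) {
            throw new IllegalArgumentException("expectedDimensions must be positive");
        }
        this.expectedDimensions = expectedDimensions;
    }

    // True when the embedding has exactly the configured number of dimensions.
    boolean matches(float[] embedding) {
        return embedding.length == expectedDimensions;
    }

    // Throws with a descriptive message when the embedding length is wrong.
    void check(float[] embedding) {
        if (!matches(embedding)) {
            throw new IllegalStateException(String.format(
                    "Embedding has %d dimensions but the store is configured for %d",
                    embedding.length, expectedDimensions));
        }
    }
}
```

For example, new DimensionGuard(1536).check(vector) could run once at startup against a sample embedding before any documents are written.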

Provider 3 — Chroma (Standalone Vector Database)

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-chroma-store-spring-boot-starter</artifactId>
</dependency>

# application.properties
# Run Chroma: docker run -p 8000:8000 chromadb/chroma
spring.ai.vectorstore.chroma.client.host=localhost
spring.ai.vectorstore.chroma.client.port=8000
spring.ai.vectorstore.chroma.collection-name=my-documents

Provider 4 — Pinecone (Managed SaaS, Billion-Scale)

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pinecone-store-spring-boot-starter</artifactId>
</dependency>

# application.properties
spring.ai.vectorstore.pinecone.api-key=${PINECONE_API_KEY}
spring.ai.vectorstore.pinecone.index-name=my-index
spring.ai.vectorstore.pinecone.environment=us-east1-gcp
spring.ai.vectorstore.pinecone.project-id=my-project

Provider 5 — Redis (Low-Latency, Existing Redis Infrastructure)

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-redis-store-spring-boot-starter</artifactId>
</dependency>

# application.properties
spring.ai.vectorstore.redis.uri=redis://localhost:6379
spring.ai.vectorstore.redis.index=my-vector-index
spring.ai.vectorstore.redis.prefix=ai:docs:

Migration Script — Copy Data Between Stores

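@Service
public class VectorStoreMigrator {

    // Best-effort copy between any two VectorStore implementations.
    // VectorStore has no "list all" or paging API, so repeating the same
    // search in a loop would return the same top-K documents every time.
    // Instead, run one broad search and write the results in batches.
    // maxDocuments must be at least the number of documents in the source.
    public void migrate(VectorStore source, VectorStore target,
                        int maxDocuments, int batchSize) {
        System.out.println("Starting vector store migration...");

        // Some providers cap topK or still filter by distance, so verify the
        // final count against the source. Re-ingesting from the original
        // documents is the more reliable route for large collections.
        List<Document> all = source.similaritySearch(
                SearchRequest.query("*")
                        .withTopK(maxDocuments)
                        .withSimilarityThreshold(0.0)  // 0.0 accepts all; valid range is 0.0 to 1.0
        );

        int total = 0;
        for (int start = 0; start < all.size(); start += batchSize) {
            int end = Math.min(start + batchSize, all.size());
            List<Document> batch = all.subList(start, end);
            target.add(batch);  // the target typically re-embeds each document
            total += batch.size();
            System.out.printf("Migrated %d documents...%n", total);
        }

        if (all.size() == maxDocuments) {
            System.out.println("Warning: hit maxDocuments; the source may hold more documents");
        }
        System.out.printf("Migration complete: %d documents migrated%n", total);
    }
}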
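
The batching part of a migration can be isolated and tested without a running vector store. The sketch below shows one way to split a list into fixed-size batches and count what was written. BatchCopy is a hypothetical helper written for this illustration, and a plain List stands in for the target store.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchCopy {

    // Splits items into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Copies items into target batch by batch and returns how many were written.
    static <T> int copyInBatches(List<T> items, List<T> target, int batchSize) {
        int written = 0;
        for (List<T> batch : partition(items, batchSize)) {
            target.addAll(batch);  // stand-in for target.add(batch) on a vector store
            written += batch.size();
        }
        return written;
    }
}
```

Comparing the returned count with the number of source documents gives a cheap sanity check after a migration run.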

Choosing the Right Provider

Decision tree:

Do you already have PostgreSQL?
  YES → Use PGVector (no new infrastructure)
  NO  → Continue...

Are you prototyping or building a demo?
  YES → Use SimpleVectorStore or Chroma (zero setup)
  NO  → Continue...

Will you have > 10 million vectors?
  YES → Pinecone (managed) or Milvus (self-hosted)
  NO  → PGVector or Chroma handles it fine

Do you need the lowest possible search latency?
  YES → Redis Vector or Weaviate
  NO  → PGVector is sufficient (5-50 ms typical, depending on index and data size)

Are you on a budget?
  YES → PGVector (free, uses existing DB)
  NO  → Pinecone (easiest managed option)
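
Read top to bottom, with the first YES deciding the outcome and the budget question as the final fallback, the decision tree can be written as a small function. This is one possible reading of the tree rather than a definitive rule set. ProviderChooser, its enum and its parameter names are invented for this sketch.

```java
final class ProviderChooser {

    enum Provider {
        PGVECTOR, SIMPLE_OR_CHROMA, PINECONE_OR_MILVUS, REDIS_OR_WEAVIATE, PINECONE
    }

    // Evaluates the questions in the order they appear in the decision tree.
    static Provider choose(boolean hasPostgres,
                           boolean prototyping,
                           long expectedVectors,
                           boolean latencyCritical,
                           boolean budgetConstrained) {
        if (hasPostgres) {
            return Provider.PGVECTOR;            // no new infrastructure
        }
        if (prototyping) {
            return Provider.SIMPLE_OR_CHROMA;    // minimal setup
        }
        if (expectedVectors > 10_000_000L) {
            return Provider.PINECONE_OR_MILVUS;  // managed or self-hosted at scale
        }
        if (latencyCritical) {
            return Provider.REDIS_OR_WEAVIATE;
        }
        return budgetConstrained ? Provider.PGVECTOR : Provider.PINECONE;
    }
}
```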

Key Points

  • Never import a provider-specific class in your service code — only use VectorStore, SearchRequest, and Document from Spring AI core
  • Start with PGVector in production if you already run PostgreSQL — adding the pgvector extension costs nothing and eliminates a separate infrastructure component
  • The HNSW index in PGVector handles millions of vectors efficiently, so you are unlikely to need Pinecone until you are well past the 10-million-vector mark
  • Switching providers requires only a dependency change and an application.properties update, and your business logic stays identical; existing data still has to be copied or re-ingested into the new store
  • Test your vector store choice with realistic production data volumes before committing — performance characteristics vary significantly between providers at scale
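
One way to act on the last point is to time a representative query many times against each candidate store and compare percentiles rather than averages. The sketch below shows the measuring and percentile arithmetic only. LatencyStats is a hypothetical helper, the Runnable stands in for a call such as a similarity search, and the percentile uses the nearest-rank method.

```java
import java.util.Arrays;

final class LatencyStats {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    // Runs the query the given number of times and records elapsed nanoseconds per call.
    static long[] measureNanos(Runnable query, int runs) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }
}
```

In practice you would discard the first few runs as warm-up and load the store with a realistic number of documents before measuring.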