LynseDB Embedded And Server Roadmap

This document turns the embedded/server vector database plan into an engineering checklist. The near-term goal is to make LynseDB a reliable embedded-first database that can also run as a standalone single-node service.

Product Positioning

LynseDB should optimize for this shape:

  • Embedded-first local vector database for Python applications.
  • Server-optional deployment with the same storage engine and data directory.
  • Single-node production reliability before distributed features.
  • Clear APIs, stable storage format, and predictable recovery behavior.

The target user experience is:

  • Use LocalClient in notebooks, scripts, and applications without running a separate service.
  • Run lynsedb serve against the same data directory when a process-safe or remotely accessible service is needed.
  • Move from embedded mode to server mode without data migration.

Current Baseline

The repository already has the main building blocks:

  • Rust core engine with collections, vector storage, field storage, WAL, ID map, tombstones, and compaction.
  • Multiple vector search paths: Flat, IVF, HNSW, DiskANN, PQ, RaBitQ, PolarVec, SQ8, and binary distances.
  • Python local client through PyO3 bindings.
  • HTTP server and Python HTTP client.
  • Basic API-key authentication.
  • Metadata filtering through SQL-style WHERE expressions.

Milestone 1: Reliability Core

This is the first coding priority. No higher-level feature should depend on undefined durability or ID semantics.

  • Define write visibility and commit semantics.
  • Make WAL replay idempotent.
  • Store user IDs in WAL records.
  • Ensure WAL tracks vector record count even when metadata fields are absent.
  • Recover custom user IDs after an uncommitted batch.
  • Prevent duplicate ID ambiguity on plain insert.
  • Add explicit upsert/update behavior after insert semantics are stable.
  • Add crash/reopen tests for vectors, IDs, fields, tombstones, and indexes.
  • Ensure index metadata and index files are atomically swapped.
  • Define and persist storage format version metadata.
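
One way to make replay idempotent is to give every WAL record a monotonically increasing sequence number and have the collection remember the highest one it has applied. The following is a minimal sketch of that idea, not LynseDB's actual recovery code: `WalRecord`, `Collection`, and `replay` are hypothetical names, and the in-memory dictionaries stand in for real storage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WalRecord:
    lsn: int                       # monotonically increasing sequence number
    user_id: str                   # user-supplied ID persisted in the record
    vector: tuple                  # vector payload
    fields: Optional[dict] = None  # metadata may be absent


@dataclass
class Collection:
    applied_lsn: int = 0           # highest sequence number already applied
    vectors: dict = field(default_factory=dict)
    fields: dict = field(default_factory=dict)

    def apply(self, rec):
        """Apply one record; return False if it was already applied."""
        if rec.lsn <= self.applied_lsn:
            return False
        self.vectors[rec.user_id] = rec.vector
        if rec.fields is not None:
            self.fields[rec.user_id] = rec.fields
        self.applied_lsn = rec.lsn
        return True


def replay(collection, wal):
    """Replay the WAL in order and return how many records were applied."""
    ordered = sorted(wal, key=lambda r: r.lsn)
    return sum(collection.apply(rec) for rec in ordered)
```

Replaying the same log twice leaves the collection unchanged the second time, and a vector-only record is still counted even though it carries no metadata.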

Milestone 2: Embedded Safety

Embedded mode should be safe and unsurprising.

  • Add database and collection file locks.
  • Define the supported concurrency model: a single writer process, with thread-safe access inside that process.
  • Add read-only open mode for safe multi-process readers.
  • Add explicit flush, close, and checkpoint APIs.
  • Improve errors when another writer owns the data directory.
  • Keep LocalClient and HTTPClient behavior aligned through shared tests.
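
A writer lock of this kind is often built on an advisory file lock. The sketch below assumes a POSIX system (it uses `fcntl.flock`) and invents the `LOCK` file name and the `WriterLock` and `WriterLockError` names for illustration; it is not the locking scheme LynseDB ships.

```python
import fcntl
import os


class WriterLockError(RuntimeError):
    """Raised when another writer already owns the data directory."""


class WriterLock:
    def __init__(self, data_dir):
        # Hypothetical lock file name inside the data directory.
        self.path = os.path.join(data_dir, "LOCK")
        self._fd = None

    def acquire(self):
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # Non-blocking exclusive lock: fail immediately if it is held.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            os.close(fd)
            raise WriterLockError(
                f"data directory is locked by another writer: {self.path}"
            ) from exc
        self._fd = fd

    def release(self):
        if self._fd is not None:
            fcntl.flock(self._fd, fcntl.LOCK_UN)
            os.close(self._fd)
            self._fd = None
```

Raising a dedicated error type lets the client report which directory is contended instead of surfacing a raw `OSError`.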

Milestone 3: Backup And Restore

Single-node deployment needs operational escape hatches.

  • Add collection snapshots.
  • Add collection restore from a snapshot.
  • Add database-level snapshots.
  • Add database restore from a snapshot.
  • Support consistent snapshots while reads continue.
  • Define write-blocking snapshot isolation, with LSN-based snapshots as a future option.
  • Add import/export for JSONL metadata plus binary vectors.
  • Add migration hooks for future storage format upgrades.
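
As an illustration of the JSONL-plus-binary export shape, the sketch below writes one JSON line per record next to a flat little-endian float32 vector file. The file layout, the 8-byte header, and the function names are assumptions made for this example, not a documented LynseDB format.

```python
import json
import struct


def export_collection(records, jsonl_path, vec_path, dim):
    """Write metadata as JSONL and vectors as little-endian float32."""
    with open(jsonl_path, "w", encoding="utf-8") as meta, \
            open(vec_path, "wb") as vecs:
        # Assumed header: record count and dimension as two uint32 values.
        vecs.write(struct.pack("<II", len(records), dim))
        for rec in records:
            if len(rec["vector"]) != dim:
                raise ValueError(f"record {rec['id']!r} has the wrong dimension")
            row = {"id": rec["id"], "fields": rec.get("fields")}
            meta.write(json.dumps(row) + "\n")
            vecs.write(struct.pack(f"<{dim}f", *rec["vector"]))


def import_collection(jsonl_path, vec_path):
    """Read back the records written by export_collection."""
    records = []
    with open(jsonl_path, encoding="utf-8") as meta, \
            open(vec_path, "rb") as vecs:
        count, dim = struct.unpack("<II", vecs.read(8))
        for line in meta:
            row = json.loads(line)
            vector = list(struct.unpack(f"<{dim}f", vecs.read(4 * dim)))
            records.append(
                {"id": row["id"], "fields": row["fields"], "vector": vector}
            )
    if len(records) != count:
        raise ValueError("metadata and vector files disagree on record count")
    return records
```

Row order ties the two files together, so the count check on import catches a truncated or mismatched pair.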

Milestone 4: Server Mode

The standalone service should feel like a real database process.

  • Add CLI: lynsedb serve --data-dir ./data --host 0.0.0.0 --port 7637.
  • Add config file support with environment variable overrides.
  • Add /healthz, /readyz, and /metrics.
  • Add graceful shutdown with WAL flush and checkpoint.
  • Add request limits, batch limits, and timeout configuration.
  • Generate OpenAPI documentation from the HTTP API.
  • Publish Docker Compose, systemd, and Kubernetes examples.
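
Config-file support with environment overrides usually follows a fixed precedence: built-in defaults, then the file, then the environment. The sketch below shows that layering with a JSON file; the default values, the `LYNSEDB_` variable prefix, and `load_config` itself are hypothetical choices for this example.

```python
import json
import os

# Hypothetical defaults; the real server may choose different values.
DEFAULTS = {"data_dir": "./data", "host": "127.0.0.1", "port": 7637}


def load_config(path=None, environ=None):
    """Merge defaults, an optional JSON config file, and environment overrides."""
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)
    if path is not None:
        with open(path, encoding="utf-8") as fh:
            config.update(json.load(fh))
    for key, default in DEFAULTS.items():
        raw = environ.get(f"LYNSEDB_{key.upper()}")  # assumed prefix
        if raw is not None:
            # Coerce the string from the environment to the default's type.
            config[key] = type(default)(raw)
    return config
```

Passing `environ` explicitly keeps the function easy to test without mutating the real process environment.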

Milestone 5: Query Capabilities

After the storage core is trustworthy, broaden retrieval quality.

  • Add named vector fields per record.
  • Allow each vector field to have its own dimension, metric, and index.
  • Add sparse vector storage and sparse inner-product search.
  • Add BM25 or inverted-index text retrieval.
  • Add hybrid search with RRF and weighted fusion.
  • Add rerank hooks for cross-encoders and LLM rerankers.
  • Expand metadata indexes: range, bitmap, keyword, datetime, and arrays.
  • Add filter explain/profile output.
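
For reference, reciprocal rank fusion (RRF) scores each document as the sum of 1 / (k + rank) over the result lists it appears in, while weighted fusion combines normalized scores. The sketch below shows both in plain Python; the function names, the `k=60` default, and the min-max normalization are illustrative choices, not LynseDB's API.

```python
def rrf_fuse(rankings, k=60, top_k=10):
    """Fuse ranked ID lists with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


def _min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {d: (s - lo) / span if span else 1.0 for d, s in scores.items()}


def weighted_fuse(vector_scores, text_scores, alpha=0.5, top_k=10):
    """Combine normalized vector and text scores with weight alpha."""
    v, t = _min_max(vector_scores), _min_max(text_scores)
    fused = {
        d: alpha * v.get(d, 0.0) + (1.0 - alpha) * t.get(d, 0.0)
        for d in v.keys() | t.keys()
    }
    return sorted(fused, key=lambda d: (-fused[d], d))[:top_k]
```

RRF only needs ranks, so it works when vector and text scores are on incomparable scales; weighted fusion needs the normalization step for the same reason.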

Milestone 6: Observability And Governance

Production users need to understand and control resource usage.

  • [x] Add structured logs.
  • [x] Add Prometheus metrics for latency, QPS, WAL size, memory, disk, and index build progress.
  • [x] Add query profiling: filter matches, scanned vectors, index path, and rerank cost.
  • [x] Add collection-level limits for top_k, batch size, vector count, and memory.
  • [x] Add slow-query warnings.
  • [x] Add audit log for server mode.
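
Structured logs and slow-query warnings can share one code path: emit a JSON event per request and raise the log level when the request crosses a threshold. The sketch below uses only the standard library; the logger name, the 500 ms threshold, and the event fields are assumptions for illustration.

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("lynsedb.request")  # hypothetical logger name


def log_request(route, started, slow_ms=500.0, now=None):
    """Emit one JSON log event; use WARNING when the request was slow."""
    now = time.perf_counter() if now is None else now
    elapsed_ms = (now - started) * 1000.0
    event = {
        "request_id": uuid.uuid4().hex,
        "route": route,
        "elapsed_ms": round(elapsed_ms, 3),
        "slow": elapsed_ms >= slow_ms,
    }
    level = logging.WARNING if event["slow"] else logging.INFO
    logger.log(level, json.dumps(event))
    return event
```

A caller would record `started = time.perf_counter()` before handling the request and call `log_request` afterwards.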

Milestone 7: Ecosystem

LynseDB should fit into common AI application stacks.

  • Add LangChain integration.
  • Add LlamaIndex integration.
  • Add Haystack integration.
  • Add examples for OpenAI embeddings, sentence-transformers, and FastEmbed.
  • Add import tools for NumPy, Parquet, FAISS, Chroma, and Qdrant-style exports.
  • Publish reproducible benchmarks for recall, latency, memory, disk usage, and index build time.

Milestone 8: Distributed Features Later

Distributed work should wait until single-node semantics are stable.

  • Start with snapshot shipping and read replicas.
  • Add collection-level or partition-key sharding.
  • Add replication only after backup, restore, WAL, and manifest semantics are stable.
  • Add coordinator and rolling-upgrade support only when there is a clear operational story.

Milestone 1 Immediate Coding Queue

  • [x] Fix WAL record counting for batches without metadata.
  • [x] Extend WAL segments to persist user IDs.
  • [x] Make collection recovery use persisted IDs and skip already-applied WAL rows.
  • [x] Add insert duplicate-ID validation.
  • [x] Add tests for reopen after uncommitted vector-only and custom-ID batches.
  • [x] Add tests documenting duplicate insert behavior.
  • [x] Add explicit upsert_items after insert behavior is stable.
  • [x] Add reopen tests for upserted vectors, IDs, fields, and tombstones.
  • [x] Extend crash/reopen coverage to persisted index files.
  • [x] Ensure index metadata and index files are atomically swapped.
  • [x] Define and persist storage format version metadata.
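
An atomic swap of index files and metadata is commonly done by writing each file to a temporary name, syncing it, and renaming it into place, with the metadata file published last. The sketch below assumes a POSIX filesystem; `atomic_write`, `publish_index`, the `MANIFEST.json` name, and the `format_version` field are hypothetical and only illustrate the pattern.

```python
import json
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())      # make the contents durable first
        os.replace(tmp, path)          # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)               # persist the rename itself (POSIX)
    finally:
        os.close(dir_fd)


def publish_index(index_dir, index_bytes, generation):
    """Write a new index file, then point the manifest at it."""
    name = f"index-{generation:06d}.bin"
    atomic_write(os.path.join(index_dir, name), index_bytes)
    manifest = {"format_version": 1, "index_file": name}
    payload = json.dumps(manifest).encode("utf-8")
    atomic_write(os.path.join(index_dir, "MANIFEST.json"), payload)
    return manifest
```

Because the manifest is replaced only after the index file is fully on disk, a crash at any point leaves the manifest pointing at a complete index.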

Milestone 2 Immediate Coding Queue

  • [x] Add a collection-level writer lock for embedded mode.
  • [x] Add explicit flush, checkpoint, and close APIs.
  • [x] Expose flush, checkpoint, and close through Python and HTTP clients.
  • [x] Add regression tests for writer-lock rejection and checkpoint recovery.
  • [x] Add database-level and manager-level writer locks.
  • [x] Add read-only open mode for safe multi-process readers.
  • [x] Add graceful server shutdown that checkpoints open collections.

Milestone 3 Immediate Coding Queue

  • [x] Add collection filesystem snapshots.
  • [x] Add restore collection from snapshot.
  • [x] Expose snapshot and restore through Python local bindings.
  • [x] Expose snapshot and restore through HTTP server/client APIs.
  • [x] Add snapshot/restore regression tests.
  • [x] Add database-level filesystem snapshots.
  • [x] Add restore database from snapshot.
  • [x] Expose database snapshot and restore through Python/HTTP APIs.
  • [x] Add database snapshot/restore regression tests.
  • [x] Add consistent snapshot while reads continue.
  • [x] Add import/export for JSONL metadata plus binary vectors.

Milestone 4 Immediate Coding Queue

  • [x] Add CLI support for lynse serve --data-dir ... (with backward-compatible run / --root).
  • [x] Add /healthz and /readyz endpoints for liveness/readiness probes.
  • [x] Add /metrics endpoint with Prometheus gauges, request counters, and latency histogram.
  • [x] Add config-file support with environment-variable overrides.
  • [x] Make request limits and timeout settings configurable.
  • [x] Generate and publish OpenAPI docs from HTTP routes.
  • [x] Publish Docker Compose, systemd, and Kubernetes examples.

Milestone 5 Immediate Coding Queue

  • [x] Add /search_profile with filter matches, estimated scanned vectors, index path, and timing data.
  • [x] Add baseline BM25 metadata text search over current field rows.
  • [x] Add hybrid vector/text search with RRF and weighted fusion.
  • [x] Expose query profile, text search, and hybrid search through local and HTTP clients.
  • [x] Add named vector fields per record.
  • [x] Allow each vector field to define its own dimension, metric, and index.
  • [x] Add sparse vector storage and sparse inner-product search.
  • [x] Add persistent inverted indexes for high-throughput text retrieval.
  • [x] Add external rerank hooks for cross-encoders and LLM rerankers.
  • [x] Expand metadata indexes for range, bitmap, keyword, datetime, and arrays.
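
To make the BM25 item concrete, the sketch below scores a small in-memory corpus with a common BM25 variant (the non-negative IDF form, with `k1=1.2` and `b=0.75`). It tokenizes on whitespace and scans every document, so it mirrors a baseline scan rather than an inverted index; `bm25_scores` is a hypothetical helper, not the engine's implementation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return {doc_id: score} for documents matching at least one query term."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized.values()) / n_docs if n_docs else 0.0
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
            norm = tf[term] + k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        if score > 0.0:
            scores[doc_id] = score
    return scores
```

The resulting score dictionary can be fed directly into a weighted fusion step, or sorted into a ranked list for RRF.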

Milestone 6 Immediate Coding Queue

  • [x] Extend /metrics with error counters by kind and estimated p50/p90/p99 latency gauges.
  • [x] Add Prometheus metrics for WAL size, memory usage, disk usage, and index build progress.
  • [x] Add structured request logging with request IDs and slow-query warnings.
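
Estimated p50/p90/p99 gauges are typically derived from cumulative histogram buckets by linear interpolation inside the bucket that contains the target rank. The sketch below shows that calculation; `estimate_quantile` and the `(upper_bound, cumulative_count)` bucket layout are assumptions chosen to resemble Prometheus-style histograms, not the server's actual code.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from sorted (upper_bound, cumulative_count) pairs."""
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # No upper edge to interpolate toward; report the last finite bound.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Linear interpolation within the bucket.
            fraction = (rank - prev_count) / in_bucket
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

The estimate is only as fine as the bucket boundaries, which is why such gauges are labeled as estimates.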