Tutorial: Indexing Guide

Indexes control the tradeoff between recall, latency, memory, disk usage, and build time.

The default choice

Start with a flat metric index:

collection.build_index("FLAT-L2")

Flat search is exhaustive and simple. It is a good baseline for correctness, small collections, and evaluation.

A practical first decision:

Collection size and goal | Start with
development, tests, evaluation | FLAT-L2, FLAT-COS, or FLAT
low-latency online ANN | HNSW-L2, HNSW-Cos, or HNSW
large collections with explicit probe tuning | IVF-L2, IVF-COS, or IVF
memory pressure from graph indexes | DiskANN-L2, DiskANN-Cos, or DiskANN
lower memory or disk footprint | SQ8, PQ, RaBitQ, or PolarVec variants
binary vectors | FLAT-HAMMING-BINARY, FLAT-JACCARD-BINARY, or IVF binary variants

Metric names

Index suffix | Metric | Best for
none or -IP (FLAT, HNSW, IVF, DiskANN) | Inner product | normalized embeddings, maximum-score retrieval
-L2 | Squared L2 distance | Euclidean-distance embeddings
-COS, -Cos, or -COSINE | Cosine similarity | embeddings where angular similarity matters
-HAMMING or -HAMMING-BINARY | Hamming distance | binary vectors
-JACCARD or -JACCARD-BINARY | Jaccard distance | binary sets

Choose the metric that matches how your embedding model was trained.

Metric guidance:

  • use cosine when your embedding model documentation recommends cosine;
  • use inner product when embeddings are normalized and maximum-score retrieval is desired;
  • use L2 when Euclidean distance is meaningful for your model;
  • use Hamming or Jaccard only for binary vectors or binary-set style features.
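The guidance above leans on one fact worth seeing in numbers: for unit-length vectors, inner product, cosine similarity, and squared L2 distance all produce the same ranking, because ||a - b||^2 = 2 - 2(a . b). The following is a minimal standard-library sketch with made-up vectors. It does not call LynseDB, and the helper names are illustrative only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy data (hypothetical): one query and three documents, all scaled to unit length.
query = normalize([1.0, 2.0, 2.0])
docs = [normalize(v) for v in ([2.0, 1.0, 2.0], [0.0, 3.0, 4.0], [-1.0, 0.5, 0.0])]

# Similarities rank from high to low, distances from low to high.
by_ip = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))
by_cos = sorted(range(len(docs)), key=lambda i: -cosine(query, docs[i]))
by_l2 = sorted(range(len(docs)), key=lambda i: squared_l2(query, docs[i]))

print(by_ip, by_cos, by_l2)
```

If your embeddings are not normalized, the three rankings can diverge, which is why the metric should follow the embedding model's documentation.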

Index families

Family | Example | Strength | Notes
Flat | FLAT-L2 | highest recall and simplest behavior | Latency grows with collection size.
HNSW | HNSW-L2 | low-latency ANN | Use nprobe as search breadth (ef_search).
IVF | IVF-L2 | large collections with tunable probes | Requires n_clusters; use nprobe at search time.
DiskANN | DiskANN-L2 | disk-friendly graph ANN | Useful when memory pressure matters.
Quantized flat | FLAT-IP-SQ8, FLAT-L2-PQ, FLAT-IP-RABITQ | lower memory footprint | Some variants use two-pass search to preserve quality.

Build lifecycle

Indexes are persisted with the collection. Reopening the collection reloads the index metadata and index files where applicable.

After a large initial load:

with collection.insert_session() as session:
    session.bulk_add_items(items, enable_progress_bar=False)

collection.build_index("HNSW-L2")
collection.checkpoint()

After many incremental writes, rebuild or switch index modes when recall or latency has drifted from your target:

collection.build_index("HNSW-L2")
collection.commit()

LynseDB can insert into existing graph indexes for supported paths, but a controlled rebuild after large changes is still the easiest way to re-baseline performance.

Build and remove indexes

collection.build_index("HNSW-L2")
print(collection.index_mode)

collection.remove_index()
print(collection.index_mode)

Removing an index returns the collection to flat search.

IVF parameters

IVF splits vectors into coarse clusters. n_clusters controls how many clusters are built.

collection.build_index("IVF-L2", n_clusters=256)

Rules:

  • n_clusters is used only for IVF indexes.
  • For non-IVF indexes, n_clusters is allowed and ignored by the Python API.
  • For IVF indexes, n_clusters must be greater than zero.
  • More clusters usually reduce scanned vectors per query but can require higher nprobe for recall.

Search with:

result = collection.search(query, k=10, nprobe=20)

For IVF, nprobe is the number of clusters to scan. Higher values improve recall and increase latency.

Starting values:

  • n_clusters=64 for small experiments;
  • n_clusters=256 or 1024 for larger collections;
  • nprobe=10 as a default search starting point;
  • increase nprobe until recall is acceptable against a flat baseline.
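To make the roles of n_clusters and nprobe concrete, here is a toy IVF written with the standard library only. It is a teaching sketch under simplifying assumptions (a few Lloyd iterations, squared L2, random Gaussian data) and is not how LynseDB implements IVF. The names build_ivf, assign, and ivf_search are hypothetical.

```python
import random

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(vectors, centroids):
    """Put each vector ID into the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: squared_l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def build_ivf(vectors, n_clusters, iterations=5, seed=0):
    """Toy build step: a few k-means iterations, then one ID list per cluster."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_clusters)]
    dim = len(vectors[0])
    for _ in range(iterations):
        for c, ids in enumerate(assign(vectors, centroids)):
            if ids:  # keep the previous centroid if a cluster ends up empty
                centroids[c] = [sum(vectors[i][d] for i in ids) / len(ids) for d in range(dim)]
    return centroids, assign(vectors, centroids)

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: squared_l2(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: squared_l2(query, vectors[i]))
    return candidates[:k], len(candidates)

rng = random.Random(42)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(400)]
query = [rng.gauss(0, 1) for _ in range(8)]

centroids, lists = build_ivf(vectors, n_clusters=16)
exact = sorted(range(len(vectors)), key=lambda i: squared_l2(query, vectors[i]))[:10]

results = {}
for nprobe in (1, 4, 16):
    ids, scanned = ivf_search(query, vectors, centroids, lists, k=10, nprobe=nprobe)
    results[nprobe] = (ids, scanned)
    overlap = len(set(ids) & set(exact)) / len(exact)
    print(f"nprobe={nprobe:2d} scanned={scanned:3d} overlap_with_flat={overlap:.2f}")
```

Scanning every cluster (nprobe equal to n_clusters) degenerates to exhaustive search, so it matches the flat result while giving up the latency benefit. Smaller nprobe values scan fewer vectors and may miss true neighbors that fall in unprobed clusters.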

HNSW search breadth

For HNSW, nprobe is used as the search beam width:

collection.build_index("HNSW-L2")
result = collection.search(query, k=10, nprobe=64)

Higher nprobe generally improves recall and increases latency. Flat, PQ, RaBitQ, PolarVec, and named vector-field searches ignore nprobe.

Start with nprobe=32 or 64 for HNSW evaluation, then tune down for latency or up for recall.
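One way to turn "tune down for latency or up for recall" into a repeatable step is a small sweep that returns the smallest nprobe meeting a recall target against the flat baseline. The sketch below is generic: smallest_nprobe and recall_against are hypothetical helpers, and fake_search is a stand-in used only so the example runs without a collection.

```python
def recall_against(baseline_ids, candidate_ids):
    """Fraction of baseline IDs that also appear in the candidate result."""
    baseline = set(baseline_ids)
    if not baseline:
        return 1.0
    return len(baseline & set(candidate_ids)) / len(baseline)

def smallest_nprobe(search_fn, baseline_ids, candidates, target_recall):
    """Return (nprobe, recall) for the smallest candidate that meets the target, else None."""
    for nprobe in sorted(candidates):
        recall = recall_against(baseline_ids, search_fn(nprobe))
        if recall >= target_recall:
            return nprobe, recall
    return None

# Stand-in for a real search: pretends that wider searches recover more true neighbors.
# With a real collection, search_fn could wrap the call shown in this guide, for example:
#   lambda n: collection.search(query, k=10, nprobe=n).ids.tolist()
baseline = list(range(10))

def fake_search(nprobe):
    hits = min(len(baseline), nprobe // 8)
    misses = [100 + i for i in range(len(baseline) - hits)]
    return baseline[:hits] + misses

print(smallest_nprobe(fake_search, baseline, [128, 16, 64, 32], target_recall=0.8))
```

In practice you would average recall over many queries rather than one, and record latency next to each nprobe so the tradeoff is visible.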

Approximate flat distance rounding

For flat IP, L2, and cosine paths, approx=True enables metric-specific distance rounding controlled by eps.

result = collection.search(query, k=10, approx=True, eps=1e-4)

Hamming and Jaccard binary metrics ignore approx=True and always use the exact binary-distance path.

Use approx=True only after measuring quality on your own evaluation set.

Named vector indexes

Each named vector field has its own metric, dimension, and index:

collection.create_vector_field("image", dim=512, metric="l2")
collection.add_named_vectors("image", image_vectors, ids=image_ids)
collection.build_index("HNSW-L2", field_name="image")

result = collection.search(image_query, k=10, vector_field="image")

Remove only that field's index:

collection.remove_index(field_name="image")

Rules for named fields:

  • create the field before adding named vectors;
  • add named vectors only for IDs that already exist in the primary collection;
  • choose the field metric at creation time;
  • build or remove the named field index with field_name=...;
  • where filters (metadata filters) still use the row metadata fields.

Quantized indexes

Quantized indexes reduce memory or disk pressure. They are most useful when vector bandwidth or index size is the bottleneck.

Family | Examples | When to try
SQ8 | FLAT-L2-SQ8, HNSW-L2-SQ8, IVF-COS-SQ8 | You want scalar quantization with familiar index families.
PQ | FLAT-L2-PQ, FLAT-IP-PQ8, FLAT-IP-PQ16 | You want product quantization and can evaluate recall tradeoffs.
RaBitQ | FLAT-IP-RABITQ, FLAT-L2-RABITQ | You want aggressive binary-style compression.
PolarVec | FLAT-IP-POLARVEC4, FLAT-L2-POLARVEC | You want training-free multi-bit quantization.

Quantized indexes should be evaluated against a flat baseline. Use the same queries, filters, and k values your application uses.
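To give a feel for what 8-bit scalar quantization trades away, the sketch below stores each dimension as one byte using a per-dimension minimum and step size, then measures the reconstruction error. This is a generic illustration written for this guide. LynseDB's SQ8 may choose ranges, rounding, or layout differently, and train_sq8, encode, and decode are hypothetical names.

```python
import random

def train_sq8(vectors):
    """Learn a per-dimension offset and step so each value maps to 0..255."""
    dim = len(vectors[0])
    lows = [min(v[d] for v in vectors) for d in range(dim)]
    highs = [max(v[d] for v in vectors) for d in range(dim)]
    # Fall back to a step of 1.0 for constant dimensions so we never divide by zero.
    steps = [(hi - lo) / 255 or 1.0 for lo, hi in zip(lows, highs)]
    return lows, steps

def encode(v, lows, steps):
    """One byte per dimension (float32 storage would use four)."""
    return bytes(min(255, max(0, round((x - lo) / step))) for x, lo, step in zip(v, lows, steps))

def decode(code, lows, steps):
    return [lo + c * step for c, lo, step in zip(code, lows, steps)]

rng = random.Random(0)
vectors = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(200)]

lows, steps = train_sq8(vectors)
codes = [encode(v, lows, steps) for v in vectors]

# Worst absolute error over every stored value; it is bounded by half a quantization step.
worst = max(
    abs(x - y)
    for v, code in zip(vectors, codes)
    for x, y in zip(v, decode(code, lows, steps))
)
print(f"bytes per vector: {len(codes[0])}, worst error: {worst:.5f}, half step: {max(steps) / 2:.5f}")
```

The per-value error is small, but it can still reorder near-tied neighbors, which is why recall should be checked against the flat baseline.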

Binary indexes

Binary indexes are for binary vectors or binary-set style representations:

collection.build_index("FLAT-HAMMING-BINARY")
collection.build_index("IVF-JACCARD-BINARY", n_clusters=256)

For Hamming and Jaccard, lower distance is better. approx and eps do not change binary-distance search behavior.
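As a reminder of what these two distances measure, here is a small standard-library sketch that packs bit lists into integers and ranks three stored vectors against a query. The data and helper names are invented for illustration and do not reflect LynseDB's internal binary layout.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into a single integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | (1 if bit else 0)
    return value

def popcount(x):
    return bin(x).count("1")

def hamming_distance(a, b):
    """Number of positions where the two bit vectors differ."""
    return popcount(a ^ b)

def jaccard_distance(a, b):
    """1 - |intersection| / |union| of the set bits; two empty sets count as distance 0 here."""
    union = popcount(a | b)
    return 0.0 if union == 0 else 1.0 - popcount(a & b) / union

query = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
stored = {
    "a": pack_bits([1, 0, 1, 1, 0, 0, 1, 1]),  # one extra bit set
    "b": pack_bits([0, 1, 0, 0, 1, 1, 0, 1]),  # bitwise complement of the query
    "c": pack_bits([1, 0, 1, 0, 0, 0, 0, 0]),  # subset of the query's bits
}

# Lower is better for both metrics, so sort ascending.
by_hamming = sorted(stored, key=lambda name: hamming_distance(query, stored[name]))
by_jaccard = sorted(stored, key=lambda name: jaccard_distance(query, stored[name]))

for name in by_hamming:
    print(name, hamming_distance(query, stored[name]), round(jaccard_distance(query, stored[name]), 3))
```

Hamming counts differing positions, so it suits fixed-length binary codes. Jaccard compares the sets of active bits, so it suits sparse set-style features.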

Practical tuning workflow

  1. Build FLAT-* first and record quality on an evaluation set.
  2. Try HNSW-* for low-latency online search.
  3. Try IVF-* when you need more explicit recall/latency tuning.
  4. Use quantized variants when memory or disk footprint is the bottleneck.
  5. Always compare recall against the flat baseline before deploying.

Evaluation loop:

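def evaluate(index_mode, *, n_clusters=None, nprobe=10):
    # Rebuild the index in the requested mode; n_clusters only matters for IVF.
    collection.build_index(index_mode, n_clusters=n_clusters)
    # Run the same query with the same k for every index so the results are comparable.
    result = collection.search(query, k=10, nprobe=nprobe)
    return result.ids.tolist()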

baseline = evaluate("FLAT-L2")
hnsw = evaluate("HNSW-L2", nprobe=64)
ivf = evaluate("IVF-L2", n_clusters=256, nprobe=20)

print(baseline)
print(hnsw)
print(ivf)

Use your own relevance labels when possible. If you do not have labels yet, measure overlap with the flat baseline as a first recall proxy.
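A single query is a noisy signal, so the overlap proxy is more useful when averaged over a query set. The following sketch assumes you have already collected one ID list per query from the flat baseline and from the candidate index. The function names and the sample IDs are hypothetical.

```python
def overlap_at_k(baseline_ids, candidate_ids, k):
    """Share of the baseline's top-k IDs that the candidate also returned in its top k."""
    baseline = set(baseline_ids[:k])
    if not baseline:
        return 1.0
    return len(baseline & set(candidate_ids[:k])) / len(baseline)

def mean_overlap_at_k(baseline_runs, candidate_runs, k):
    """Average overlap across queries; both runs must list results in the same query order."""
    if len(baseline_runs) != len(candidate_runs):
        raise ValueError("both runs must cover the same queries")
    if not baseline_runs:
        raise ValueError("at least one query is required")
    scores = [overlap_at_k(b, c, k) for b, c in zip(baseline_runs, candidate_runs)]
    return sum(scores) / len(scores)

# Made-up result IDs for two queries.
flat_runs = [[1, 2, 3, 4], [5, 6, 7, 8]]
hnsw_runs = [[1, 2, 4, 9], [5, 6, 7, 8]]

print(mean_overlap_at_k(flat_runs, hnsw_runs, k=4))
```

Overlap ignores ordering within the top k. If rank position matters to your application, pair it with a rank-aware measure once relevance labels are available.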

Supported index names

All index names are case-insensitive. The examples below show the spellings that build_index() accepts.

Dense indexes:

collection.build_index("FLAT")
collection.build_index("FLAT-IP")
collection.build_index("FLAT-L2")
collection.build_index("FLAT-COS")
collection.build_index("FLAT-COSINE")
collection.build_index("FLAT-IP-SQ8")
collection.build_index("FLAT-L2-SQ8")
collection.build_index("FLAT-COS-SQ8")
collection.build_index("FLAT-COSINE-SQ8")

collection.build_index("HNSW")
collection.build_index("HNSW-IP")
collection.build_index("HNSW-L2")
collection.build_index("HNSW-COS")
collection.build_index("HNSW-COSINE")
collection.build_index("HNSW-IP-SQ8")
collection.build_index("HNSW-L2-SQ8")
collection.build_index("HNSW-COS-SQ8")
collection.build_index("HNSW-COSINE-SQ8")

collection.build_index("DiskANN")
collection.build_index("DiskANN-IP")
collection.build_index("DiskANN-L2")
collection.build_index("DiskANN-COS")
collection.build_index("DiskANN-COSINE")
collection.build_index("DiskANN-IP-SQ8")
collection.build_index("DiskANN-L2-SQ8")
collection.build_index("DiskANN-COS-SQ8")
collection.build_index("DiskANN-COSINE-SQ8")

collection.build_index("IVF", n_clusters=256)
collection.build_index("IVF-IP", n_clusters=256)
collection.build_index("IVF-L2", n_clusters=256)
collection.build_index("IVF-COS", n_clusters=256)
collection.build_index("IVF-COSINE", n_clusters=256)
collection.build_index("IVF-IP-SQ8", n_clusters=256)
collection.build_index("IVF-L2-SQ8", n_clusters=256)
collection.build_index("IVF-COS-SQ8", n_clusters=256)
collection.build_index("IVF-COSINE-SQ8", n_clusters=256)

Flat quantized variants:

collection.build_index("FLAT-IP-PQ")
collection.build_index("FLAT-L2-PQ")
collection.build_index("FLAT-COS-PQ")
collection.build_index("FLAT-COSINE-PQ")
collection.build_index("FLAT-IP-PQ8")
collection.build_index("FLAT-IP-PQ16")
collection.build_index("FLAT-L2-PQ8")
collection.build_index("FLAT-COS-PQ8")
collection.build_index("FLAT-IP-RABITQ")
collection.build_index("FLAT-L2-RABITQ")
collection.build_index("FLAT-COS-RABITQ")
collection.build_index("FLAT-COSINE-RABITQ")
collection.build_index("FLAT-IP-POLARVEC")
collection.build_index("FLAT-L2-POLARVEC")
collection.build_index("FLAT-COS-POLARVEC")
collection.build_index("FLAT-COSINE-POLARVEC")
collection.build_index("FLAT-IP-POLARVEC3")
collection.build_index("FLAT-IP-POLARVEC4")
collection.build_index("FLAT-IP-POLARVEC8")

PQ accepts FLAT-{IP,L2,COS,COSINE}-PQ and FLAT-{IP,L2,COS,COSINE}-PQ<N>, where <N> is the requested number of subspaces. If <N> is omitted, LynseDB chooses an automatic subspace count.

PolarVec accepts FLAT-{IP,L2,COS,COSINE}-POLARVEC and FLAT-{IP,L2,COS,COSINE}-POLARVEC<N>, where <N> is a bit width from 1 to 8. If <N> is omitted or invalid, LynseDB uses the default bit width.
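The naming rules above can be mirrored in a few lines if you want to validate configuration before calling build_index(). This is an illustrative helper based only on the rules stated in this guide, not LynseDB's own parser. split_quantizer is a hypothetical name, and it only recognizes the PQ and PolarVec suffixes.

```python
import re

# Last token of the name: PQ or POLARVEC, optionally followed by a number.
SUFFIX = re.compile(r"^(PQ|POLARVEC)(\d*)$")

def split_quantizer(index_name):
    """Return (quantizer, n) from an index name; n is None when omitted or out of range."""
    token = index_name.upper().split("-")[-1]  # names are case-insensitive
    match = SUFFIX.match(token)
    if match is None:
        return None, None
    kind, digits = match.groups()
    n = int(digits) if digits else None
    if kind == "POLARVEC" and n is not None and not 1 <= n <= 8:
        n = None  # per this guide, an invalid bit width falls back to the default
    return kind, n

for name in ("flat-ip-pq16", "FLAT-L2-PQ", "FLAT-IP-POLARVEC4", "FLAT-IP-POLARVEC9", "HNSW-L2"):
    print(name, split_quantizer(name))
```

A check like this can flag a typo such as a PolarVec width outside 1 to 8 early, instead of silently running with the default bit width.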

Binary variants:

collection.build_index("FLAT-HAMMING-BINARY")
collection.build_index("FLAT-HAMMING")
collection.build_index("FLAT-JACCARD-BINARY")
collection.build_index("FLAT-JACCARD")
collection.build_index("IVF-HAMMING-BINARY", n_clusters=256)
collection.build_index("IVF-HAMMING", n_clusters=256)
collection.build_index("IVF-JACCARD-BINARY", n_clusters=256)
collection.build_index("IVF-JACCARD", n_clusters=256)

Troubleshooting index choices

Symptom | Try
Results differ too much from expected neighbors | Compare against FLAT-*; increase nprobe; rebuild index.
IVF recall is low | Increase nprobe, reduce n_clusters, or improve training data coverage.
HNSW latency is high | Lower nprobe, try IVF, or use a more selective where filter.
Memory use is high | Try DiskANN or quantized variants; reduce unnecessary named vector fields.
Index build is slow | Build after bulk ingestion, not after every small batch; monitor /metrics in server mode.
Binary index scores look inverted | Remember Hamming/Jaccard are lower-is-better distances.