Tutorial: Indexing Guide¶
Indexes control the tradeoff between recall, latency, memory, disk usage, and build time.
The default choice¶
Start with a flat metric index such as FLAT-L2, built with collection.build_index("FLAT-L2").
Flat search is exhaustive and simple. It is a good baseline for correctness, small collections, and evaluation.
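To make "exhaustive" concrete, here is a minimal pure-Python sketch of what a flat search does conceptually: score every stored vector against the query and keep the k best. The helper names (squared_l2, flat_search) and the sample data are illustrative assumptions, not part of the LynseDB API.

```python
import heapq

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def flat_search(vectors, query, k):
    """Score every stored vector and return the k closest (id, distance) pairs."""
    scored = ((squared_l2(vec, query), vec_id) for vec_id, vec in vectors.items())
    return [(vec_id, dist) for dist, vec_id in heapq.nsmallest(k, scored)]

# Hypothetical toy collection: id -> vector.
vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 2.0],
    "d": [3.0, 3.0],
}

top2 = flat_search(vectors, query=[0.9, 0.1], k=2)
print(top2)
```

Because every vector is scored, the cost grows linearly with collection size, which is why flat search works best as a correctness baseline rather than a large-scale serving index.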
A practical first decision:
| Collection size and goal | Start with |
|---|---|
| development, tests, evaluation | FLAT-L2, FLAT-COS, or FLAT |
| low-latency online ANN | HNSW-L2, HNSW-Cos, or HNSW |
| large collections with explicit probe tuning | IVF-L2, IVF-COS, or IVF |
| memory pressure from graph indexes | DiskANN-L2, DiskANN-Cos, or DiskANN |
| lower memory or disk footprint | SQ8, PQ, RaBitQ, or PolarVec variants |
| binary vectors | FLAT-HAMMING-BINARY, FLAT-JACCARD-BINARY, or IVF binary variants |
Metric names¶
| Index suffix | Metric | Best for |
|---|---|---|
| FLAT, HNSW, IVF, DiskANN | Inner product | normalized embeddings, maximum-score retrieval |
| -L2 | Squared L2 distance | Euclidean-distance embeddings |
| -COS or -Cos | Cosine similarity | embeddings where angular similarity matters |
| -HAMMING-BINARY | Hamming distance | binary vectors |
| -JACCARD-BINARY | Jaccard distance | binary sets |
Choose the metric that matches how your embedding model was trained.
Metric guidance:
- use cosine when your embedding model documentation recommends cosine;
- use inner product when embeddings are normalized and maximum score retrieval is desired;
- use L2 when Euclidean distance is meaningful for your model;
- use Hamming or Jaccard only for binary vectors or binary-set style features.
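The guidance above can be checked on a toy example. The sketch below implements the three dense metrics in plain Python and shows that inner product favors large-magnitude vectors on raw embeddings, while after unit normalization inner product, cosine, and squared L2 all agree on the ranking. All function names and vectors here are illustrative assumptions.

```python
import math

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_similarity(a, b):
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(inner_product(v, v))
    return [x / norm for x in v]

query = [1.0, 0.0]
docs = {"long": [10.0, 10.0], "aligned": [1.0, 0.1]}

# On raw vectors, inner product rewards magnitude; cosine rewards direction.
best_by_ip = max(docs, key=lambda d: inner_product(query, docs[d]))
best_by_cos = max(docs, key=lambda d: cosine_similarity(query, docs[d]))

# After unit normalization the three metrics produce the same ranking.
unit_query = normalize(query)
unit_docs = {name: normalize(vec) for name, vec in docs.items()}
best_by_ip_unit = max(unit_docs, key=lambda d: inner_product(unit_query, unit_docs[d]))
best_by_l2_unit = min(unit_docs, key=lambda d: squared_l2(unit_query, unit_docs[d]))

print(best_by_ip, best_by_cos, best_by_ip_unit, best_by_l2_unit)
```

If your model's embeddings are not normalized, the metric choice can change which neighbors come back, so it is worth confirming the model's recommended metric before building an index.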
Index families¶
| Family | Example | Strength | Notes |
|---|---|---|---|
| Flat | FLAT-L2 | highest recall and simplest behavior | Latency grows with collection size. |
| HNSW | HNSW-L2 | low-latency ANN | Use nprobe as search breadth (ef_search). |
| IVF | IVF-L2 | large collections with tunable probes | Requires n_clusters; use nprobe at search time. |
| DiskANN | DiskANN-L2 | disk-friendly graph ANN | Useful when memory pressure matters. |
| Quantized flat | FLAT-IP-SQ8, FLAT-L2-PQ, FLAT-IP-RABITQ | lower memory footprint | Some variants use two-pass search to preserve quality. |
Build lifecycle¶
Indexes are persisted with the collection. Reopening the collection reloads the index metadata and index files where applicable.
After a large initial load:
with collection.insert_session() as session:
    session.bulk_add_items(items, enable_progress_bar=False)
collection.build_index("HNSW-L2")
collection.checkpoint()
After many incremental writes, rebuild or switch index modes when recall or latency has drifted from your target.
LynseDB can insert into existing graph indexes for supported paths, but a controlled rebuild after large changes is still the easiest way to re-baseline performance.
Build and remove indexes¶
collection.build_index("HNSW-L2")
print(collection.index_mode)
collection.remove_index()
print(collection.index_mode)
Removing an index returns the collection to flat search.
IVF parameters¶
IVF splits vectors into coarse clusters. n_clusters controls how many clusters
are built.
Rules:
- n_clusters is used only for IVF indexes.
- For non-IVF indexes, n_clusters is allowed and ignored by the Python API.
- For IVF indexes, n_clusters must be greater than zero.
- More clusters usually reduce scanned vectors per query but can require higher nprobe for recall.
Pass nprobe at search time, for example collection.search(query, k=10, nprobe=10).
For IVF, nprobe is the number of clusters to scan. Higher values improve
recall and increase latency.
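The recall effect of nprobe can be seen in a toy inverted-file sketch. This is a simplified illustration rather than LynseDB's implementation: the centroids are given by hand instead of trained, and the names build_ivf and ivf_search are assumptions made for this example. The query sits near a cluster boundary, so scanning only one cluster misses the true nearest neighbor.

```python
def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign each vector id to its nearest centroid (one inverted list per cluster)."""
    lists = [[] for _ in centroids]
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda c: squared_l2(vec, centroids[c]))
        lists[nearest].append(vec_id)
    return lists

def ivf_search(vectors, centroids, lists, query, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    probe_order = sorted(range(len(centroids)), key=lambda c: squared_l2(query, centroids[c]))
    candidates = [vec_id for c in probe_order[:nprobe] for vec_id in lists[c]]
    candidates.sort(key=lambda vec_id: squared_l2(vectors[vec_id], query))
    return candidates[:k]

# Hand-picked centroids (a real index would train them, e.g. with k-means).
centroids = [[0.0, 0.0], [10.0, 0.0]]
vectors = {
    "a": [4.95, 0.0],   # falls in cluster 0, just left of the boundary
    "b": [5.3, 0.0],    # falls in cluster 1
    "c": [0.0, 0.0],
    "d": [10.0, 0.0],
}
lists = build_ivf(vectors, centroids)

query = [5.05, 0.0]  # closest centroid is cluster 1, closest vector is "a"
narrow = ivf_search(vectors, centroids, lists, query, k=1, nprobe=1)
wide = ivf_search(vectors, centroids, lists, query, k=1, nprobe=2)
print(narrow, wide)
```

With nprobe equal to the number of clusters, the scan covers every vector and the result matches an exhaustive search, at the cost of flat-search latency.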
Starting values:
- n_clusters=64 for small experiments;
- n_clusters=256 or 1024 for larger collections;
- nprobe=10 as a default search starting point;
- increase nprobe until recall is acceptable against a flat baseline.
HNSW search breadth¶
For HNSW, nprobe is used as the search beam width (ef_search), for example collection.search(query, k=10, nprobe=64).
Higher nprobe generally improves recall and increases latency.
Flat, PQ, RaBitQ, PolarVec, and named vector-field searches ignore nprobe.
Start with nprobe=32 or 64 for HNSW evaluation, then tune down for latency
or up for recall.
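To illustrate why a wider beam helps, here is a minimal single-layer best-first graph search in the style of HNSW's bottom layer. It is a teaching sketch under simplifying assumptions (a hand-built graph, no layer hierarchy), and beam_search and the ef argument are names chosen for this example, not LynseDB internals. With a beam of 1 the search gets stuck in a local minimum; a beam of 2 keeps the alternative path alive and finds the true neighbor.

```python
import heapq

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def beam_search(vectors, neighbors, entry, query, k, ef):
    """Best-first graph search that keeps at most ef nodes in the result set."""
    visited = {entry}
    d0 = squared_l2(vectors[entry], query)
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    results = [(-d0, entry)]     # max-heap via negation: farthest kept node first
    while candidates:
        dist, node = heapq.heappop(candidates)
        if len(results) >= ef and dist > -results[0][0]:
            break  # the closest remaining candidate cannot improve the results
        for nxt in neighbors[node]:
            if nxt in visited:
                continue
            visited.add(nxt)
            d = squared_l2(vectors[nxt], query)
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(candidates, (d, nxt))
                heapq.heappush(results, (-d, nxt))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the farthest kept node
    ordered = sorted((-neg, node) for neg, node in results)
    return [node for _, node in ordered][:k]

# Hand-built graph: node 1 looks promising but is a dead end; node 3 is the true neighbor.
vectors = {0: [10.0, 0.0], 1: [3.0, 0.0], 2: [0.0, 6.0], 3: [0.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
query = [0.0, 0.0]

narrow = beam_search(vectors, neighbors, entry=0, query=query, k=1, ef=1)
wide = beam_search(vectors, neighbors, entry=0, query=query, k=1, ef=2)
print(narrow, wide)
```

The wider beam visits more nodes per query, which is the latency cost paid for the recall gain.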
Approximate flat distance rounding¶
For flat IP, L2, and cosine paths, approx=True enables metric-specific
distance rounding controlled by eps.
Hamming and Jaccard binary metrics ignore approx=True and always use the exact
binary-distance path.
Use approx=True only after measuring quality on your own evaluation set.
Named vector indexes¶
Each named vector field has its own metric, dimension, and index:
collection.create_vector_field("image", dim=512, metric="l2")
collection.add_named_vectors("image", image_vectors, ids=image_ids)
collection.build_index("HNSW-L2", field_name="image")
result = collection.search(image_query, k=10, vector_field="image")
Remove only that field's index by passing the same field_name, for example collection.remove_index(field_name="image").
Rules for named fields:
- create the field before adding named vectors;
- add named vectors only for IDs that already exist in the primary collection;
- choose the field metric at creation time;
- build or remove the named field index with field_name=...;
- where filters still use the row metadata fields.
Quantized indexes¶
Quantized indexes reduce memory or disk pressure. They are most useful when vector bandwidth or index size is the bottleneck.
| Family | Examples | When to try |
|---|---|---|
| SQ8 | FLAT-L2-SQ8, HNSW-L2-SQ8, IVF-COS-SQ8 | You want scalar quantization with familiar index families. |
| PQ | FLAT-L2-PQ, FLAT-IP-PQ8, FLAT-IP-PQ16 | You want product quantization and can evaluate recall tradeoffs. |
| RaBitQ | FLAT-IP-RABITQ, FLAT-L2-RABITQ | You want aggressive binary-style compression. |
| PolarVec | FLAT-IP-POLARVEC4, FLAT-L2-POLARVEC | You want training-free multi-bit quantization. |
Quantized indexes should be evaluated against a flat baseline. Use the same
queries, filters, and k values your application uses.
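As a rough illustration of where the savings and the quality loss come from, the sketch below applies 8-bit scalar quantization to a single vector. It assumes a per-vector min/max range for simplicity; production SQ8 implementations commonly learn ranges per dimension over the dataset, and the helper names here are hypothetical rather than LynseDB functions.

```python
def sq8_encode(vector):
    """Map each float to an integer code in [0, 255] using the vector's own min/max."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(round((x - lo) / scale) for x in vector)
    return codes, lo, scale

def sq8_decode(codes, lo, scale):
    """Reconstruct approximate floats from the stored codes."""
    return [lo + c * scale for c in codes]

vector = [0.12, -0.5, 0.33, 0.9, -0.07, 0.41, 0.0, -0.88]
codes, lo, scale = sq8_encode(vector)
restored = sq8_decode(codes, lo, scale)

# Rounding to the nearest code bounds the per-value error by half a step.
max_error = max(abs(a - b) for a, b in zip(vector, restored))

float32_bytes = 4 * len(vector)  # 4 bytes per dimension
sq8_bytes = len(codes)           # 1 byte per dimension (plus lo and scale overhead)
print(float32_bytes, sq8_bytes, max_error)
```

The roughly 4x reduction in vector bytes is paid for with a small reconstruction error, which is why measured recall against a flat baseline matters more than the nominal compression ratio.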
Binary indexes¶
Binary indexes are for binary vectors or binary-set style representations:
collection.build_index("FLAT-HAMMING-BINARY")
collection.build_index("IVF-JACCARD-BINARY", n_clusters=256)
For Hamming and Jaccard, lower distance is better. approx and eps do not
change binary-distance search behavior.
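For reference, both binary distances are simple to compute by hand. The following sketch uses plain lists of 0/1 values and illustrative function names; it shows the standard definitions, not LynseDB's packed-bit implementation.

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def jaccard_distance(a, b):
    """1 - |intersection| / |union| over the positions set to 1."""
    intersection = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return 0.0 if union == 0 else 1.0 - intersection / union

query = [1, 0, 1, 1, 0, 0, 1, 0]
near = [1, 0, 1, 0, 0, 0, 1, 0]   # differs from the query in one position
far = [0, 1, 0, 0, 1, 1, 0, 1]    # differs in every position

print(hamming_distance(query, near), hamming_distance(query, far))
print(jaccard_distance(query, near), jaccard_distance(query, far))
```

In both cases the closer vector has the smaller value, so results should be read in ascending order of distance.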
Practical tuning workflow¶
- Build FLAT-* first and record quality on an evaluation set.
- Try HNSW-* for low-latency online search.
- Try IVF-* when you need more explicit recall/latency tuning.
- Use quantized variants when memory or disk footprint is the bottleneck.
- Always compare recall against the flat baseline before deploying.
Evaluation loop:
def evaluate(index_mode, *, n_clusters=None, nprobe=10):
    collection.build_index(index_mode, n_clusters=n_clusters)
    result = collection.search(query, k=10, nprobe=nprobe)
    return result.ids.tolist()
baseline = evaluate("FLAT-L2")
hnsw = evaluate("HNSW-L2", nprobe=64)
ivf = evaluate("IVF-L2", n_clusters=256, nprobe=20)
print(baseline)
print(hnsw)
print(ivf)
Use your own relevance labels when possible. If you do not have labels yet, measure overlap with the flat baseline as a first recall proxy.
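One simple way to compute that overlap is the fraction of the flat baseline's top-k IDs that the candidate index also returns. The helper name overlap_at_k and the ID lists below are hypothetical placeholders; in practice you would pass in the lists returned by the evaluation loop and average the score over many queries.

```python
def overlap_at_k(baseline_ids, candidate_ids, k):
    """Fraction of the flat baseline's top-k ids that the candidate also returned."""
    expected = set(baseline_ids[:k])
    if not expected:
        return 0.0
    return len(expected & set(candidate_ids[:k])) / len(expected)

# Hypothetical result ids for a single query.
baseline = [7, 3, 9, 1, 4]
hnsw = [7, 9, 3, 4, 8]
ivf = [7, 3, 2, 6, 5]

print(overlap_at_k(baseline, hnsw, k=5))
print(overlap_at_k(baseline, ivf, k=5))
```

The measure ignores ordering within the top k, so it is a coarse proxy; relevance labels remain the better signal once they are available.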
Supported index names¶
All index names are case-insensitive. The examples below show the supported
spellings accepted by build_index().
Dense indexes:
collection.build_index("FLAT")
collection.build_index("FLAT-IP")
collection.build_index("FLAT-L2")
collection.build_index("FLAT-COS")
collection.build_index("FLAT-COSINE")
collection.build_index("FLAT-IP-SQ8")
collection.build_index("FLAT-L2-SQ8")
collection.build_index("FLAT-COS-SQ8")
collection.build_index("FLAT-COSINE-SQ8")
collection.build_index("HNSW")
collection.build_index("HNSW-IP")
collection.build_index("HNSW-L2")
collection.build_index("HNSW-COS")
collection.build_index("HNSW-COSINE")
collection.build_index("HNSW-IP-SQ8")
collection.build_index("HNSW-L2-SQ8")
collection.build_index("HNSW-COS-SQ8")
collection.build_index("HNSW-COSINE-SQ8")
collection.build_index("DiskANN")
collection.build_index("DiskANN-IP")
collection.build_index("DiskANN-L2")
collection.build_index("DiskANN-COS")
collection.build_index("DiskANN-COSINE")
collection.build_index("DiskANN-IP-SQ8")
collection.build_index("DiskANN-L2-SQ8")
collection.build_index("DiskANN-COS-SQ8")
collection.build_index("DiskANN-COSINE-SQ8")
collection.build_index("IVF", n_clusters=256)
collection.build_index("IVF-IP", n_clusters=256)
collection.build_index("IVF-L2", n_clusters=256)
collection.build_index("IVF-COS", n_clusters=256)
collection.build_index("IVF-COSINE", n_clusters=256)
collection.build_index("IVF-IP-SQ8", n_clusters=256)
collection.build_index("IVF-L2-SQ8", n_clusters=256)
collection.build_index("IVF-COS-SQ8", n_clusters=256)
collection.build_index("IVF-COSINE-SQ8", n_clusters=256)
Flat quantized variants:
collection.build_index("FLAT-IP-PQ")
collection.build_index("FLAT-L2-PQ")
collection.build_index("FLAT-COS-PQ")
collection.build_index("FLAT-COSINE-PQ")
collection.build_index("FLAT-IP-PQ8")
collection.build_index("FLAT-IP-PQ16")
collection.build_index("FLAT-L2-PQ8")
collection.build_index("FLAT-COS-PQ8")
collection.build_index("FLAT-IP-RABITQ")
collection.build_index("FLAT-L2-RABITQ")
collection.build_index("FLAT-COS-RABITQ")
collection.build_index("FLAT-COSINE-RABITQ")
collection.build_index("FLAT-IP-POLARVEC")
collection.build_index("FLAT-L2-POLARVEC")
collection.build_index("FLAT-COS-POLARVEC")
collection.build_index("FLAT-COSINE-POLARVEC")
collection.build_index("FLAT-IP-POLARVEC3")
collection.build_index("FLAT-IP-POLARVEC4")
collection.build_index("FLAT-IP-POLARVEC8")
PQ accepts FLAT-{IP,L2,COS,COSINE}-PQ and
FLAT-{IP,L2,COS,COSINE}-PQ<N>, where <N> is the requested number of
subspaces. If <N> is omitted, LynseDB chooses an automatic subspace count.
PolarVec accepts FLAT-{IP,L2,COS,COSINE}-POLARVEC and
FLAT-{IP,L2,COS,COSINE}-POLARVEC<N>, where <N> is a bit width from 1 to 8.
If <N> is omitted or invalid, LynseDB uses the default bit width.
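The naming pattern for PQ and PolarVec variants can be expressed as a small regular expression. The sketch below is only an illustration of the pattern described above, not LynseDB's own parser; parse_quantized_name is a hypothetical helper, and it does not enforce the 1 to 8 bit-width range for PolarVec.

```python
import re

# FLAT-{IP,L2,COS,COSINE}-{PQ,POLARVEC}<N>, matched case-insensitively.
_QUANTIZED_NAME = re.compile(
    r"^FLAT-(?P<metric>IP|L2|COS|COSINE)-(?P<family>PQ|POLARVEC)(?P<n>\d+)?$",
    re.IGNORECASE,
)

def parse_quantized_name(name):
    """Split a PQ/PolarVec index name into metric, family, and optional <N>."""
    match = _QUANTIZED_NAME.match(name)
    if match is None:
        return None
    n = match.group("n")
    return {
        "metric": match.group("metric").upper(),
        "family": match.group("family").upper(),
        "n": int(n) if n is not None else None,  # None: no <N> was given
    }

print(parse_quantized_name("FLAT-IP-PQ16"))
print(parse_quantized_name("flat-cosine-polarvec"))
print(parse_quantized_name("HNSW-L2"))
```

A check like this can be handy for validating configuration values before they reach build_index().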
Binary variants:
collection.build_index("FLAT-HAMMING-BINARY")
collection.build_index("FLAT-HAMMING")
collection.build_index("FLAT-JACCARD-BINARY")
collection.build_index("FLAT-JACCARD")
collection.build_index("IVF-HAMMING-BINARY", n_clusters=256)
collection.build_index("IVF-HAMMING", n_clusters=256)
collection.build_index("IVF-JACCARD-BINARY", n_clusters=256)
collection.build_index("IVF-JACCARD", n_clusters=256)
Troubleshooting index choices¶
| Symptom | Try |
|---|---|
| Results differ too much from expected neighbors | Compare against FLAT-*; increase nprobe; rebuild index. |
| IVF recall is low | Increase nprobe, reduce n_clusters, or improve training data coverage. |
| HNSW latency is high | Lower nprobe, try IVF, or use a more selective where filter. |
| Memory use is high | Try DiskANN or quantized variants; reduce unnecessary named vector fields. |
| Index build is slow | Build after bulk ingestion, not after every small batch; monitor /metrics in server mode. |
| Binary index scores look inverted | Remember Hamming/Jaccard are lower-is-better distances. |