Tutorial: Troubleshooting¶
This page maps common LynseDB symptoms to likely causes and fixes.
Installation¶
Native Windows does not work¶
Native Windows environments are not supported. Use one of:
- WSL 2 with Linux Python;
- Docker server mode;
- a Linux or macOS environment.
Import fails¶
Check the Python version:
LynseDB requires Python 3.9 or newer.
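A quick way to confirm which interpreter is active and whether it meets the minimum is sketched below. It uses only the standard library; the helper name `python_is_supported` is illustrative and not part of LynseDB.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in this section


def python_is_supported(version_info=None):
    """Return True when the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


# Print the interpreter path too: import failures often come from
# installing into one environment and running another.
print("interpreter:", sys.executable)
print("version:", sys.version.split()[0])
print("supported:", python_is_supported())
```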
Reinstall in the active environment:
Connection¶
Cannot connect to remote server¶
Start the server:
Check the root endpoint:
Then connect:
Make sure the client URL includes http:// or https://.
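As a rough pre-flight check, the URL can be validated before it is handed to the client. This is a minimal sketch using the standard library; `has_http_scheme` is a hypothetical helper, and the host and port in the examples are placeholders.

```python
from urllib.parse import urlparse


def has_http_scheme(url):
    """Return True when the URL has an http/https scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Placeholder addresses, for illustration only.
for candidate in ("http://localhost:7637", "localhost:7637"):
    print(candidate, "->", has_http_scheme(candidate))
```

A bare `host:port` string fails the check because it carries no scheme.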
Authentication failed¶
If the server was started with --api-key, pass the same key:
Raw HTTP requests need:
Public endpoints are /, /healthz, and /readyz. All other endpoints require
authentication when an API key is configured.
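The rule above can be expressed as a small decision function, which may help when you are working out why one request succeeds and another is rejected. This is a sketch of the rule as described here, not the server's code; exact-path matching is an assumption, and `/collections` is a hypothetical path used only as an example.

```python
PUBLIC_PATHS = {"/", "/healthz", "/readyz"}  # public endpoints listed above


def requires_auth(path, api_key_configured):
    """Return True when a request to `path` is expected to need credentials."""
    if not api_key_configured:
        return False  # no API key configured: nothing is protected
    # Assumption: paths are compared exactly, without normalization.
    return path not in PUBLIC_PATHS


print(requires_auth("/healthz", api_key_configured=True))
print(requires_auth("/collections", api_key_configured=True))  # hypothetical path
```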
Local data path is shared by multiple processes¶
Do not let independent writer processes share the same local root path. Use HTTP server mode so one process owns the data directory:
Database and collection setup¶
Database does not exist¶
Use create_database() once, then get_database() later:
Inspect names:
Collection does not exist¶
Create or open:
Open only if it exists:
Inspect names:
Data disappeared after a test¶
Look for drop_if_exists=True:
client.create_database("app", drop_if_exists=True)
db.require_collection("docs", dim=768, drop_if_exists=True)
These flags are destructive and should be limited to tests or explicit reset scripts.
Ingestion¶
Vector dimension error¶
The collection dimension is fixed:
Every inserted primary vector must have length 768:
For a different embedding model dimension, create a different collection or a named vector field with that dimension.
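Before inserting, it can help to scan a batch for vectors of the wrong length so the failing rows are known up front. The sketch below is plain Python and calls no LynseDB API; `find_bad_dimensions` is an illustrative name, and 768 matches the example dimension used in this section.

```python
def find_bad_dimensions(vectors, dim=768):
    """Return (position, length) pairs for vectors whose length is not `dim`."""
    return [(i, len(v)) for i, v in enumerate(vectors) if len(v) != dim]


batch = [[0.0] * 768, [0.0] * 384]  # the second vector is from a different model
for position, length in find_bad_dimensions(batch):
    print(f"vector {position} has length {length}, expected 768")
```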
Duplicate ID error¶
add_item() and bulk_add_items() expect new IDs. Use upsert to replace or
insert by ID:
Check existing IDs:
Writes are not visible as expected¶
Use insert_session() or call commit():
with collection.insert_session() as session:
    session.add_item(vector, id=1)
# or
collection.add_item(vector, id=2, buffer_size=False)
collection.commit()
For backup or shutdown, call:
Search and query¶
Search returns no rows¶
Check:
- vectors were inserted and committed;
- the query vector has the correct dimension;
- k is greater than zero;
- the where filter is not too restrictive;
- rows were not soft-deleted.
Useful inspection:
print(collection.shape)
print(collection.stats())
print(collection.list_deleted_ids())
print(collection.search_profile(query, k=5, where="tenant = 'acme'"))
query() returns empty results¶
This is expected when no filter is provided:
Pass a filter or explicit IDs:
Metadata is missing from search results¶
Set return_fields=True:
Or fetch fields after search:
Filter syntax is wrong¶
Common valid filters:
where = "lang = 'en'"
where = "rank >= 10 AND rank < 20"
where = "published = true"
where = "tags CONTAINS 'vector'"
where = "created_at >= '2026-06-01'"
where = "\"document.lang\" = 'en'"
See the Metadata filter cookbook for more examples.
Indexes¶
IVF index build fails or behaves badly¶
Pass n_clusters:
Then tune nprobe at search time:
If recall is low, compare with FLAT-L2, increase nprobe, or use fewer
clusters.
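To make "recall is low" measurable, compare the IDs returned by the flat baseline with the IDs returned by the IVF index for the same query. The function below is a generic recall@k calculation in plain Python; it assumes you have already extracted two ID lists from your search results, and the name `recall_at_k` is illustrative.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the exact top-k IDs that also appear in the approximate top-k."""
    exact = set(exact_ids[:k])
    if not exact:
        return 0.0
    return len(exact & set(approx_ids[:k])) / len(exact)


exact = [1, 2, 3, 4]    # IDs from the flat (exact) search
approx = [1, 2, 9, 4]   # IDs from the approximate search
print(recall_at_k(exact, approx, k=4))
```

Averaging this value over a sample of real queries gives a steadier signal than a single query.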
ANN results differ from expected exact neighbors¶
Approximate indexes trade recall for speed. Rebuild a flat baseline:
Then tune the ANN index:
nprobe appears to do nothing¶
nprobe controls IVF and HNSW search breadth. Flat, PQ, RaBitQ, PolarVec, and
named vector-field searches may ignore it.
Binary index scores seem reversed¶
Hamming and Jaccard are lower-is-better distances. Inner product and cosine are higher-is-better scores.
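If you re-rank or merge results in your own code, sort in the direction that matches the metric. The sketch below assumes results are available as `(id, score)` pairs; the metric labels are hypothetical strings chosen for this example, not LynseDB index names.

```python
LOWER_IS_BETTER = {"hamming", "jaccard"}   # distances
HIGHER_IS_BETTER = {"ip", "cosine"}        # similarity scores


def best_first(results, metric):
    """Sort (id, score) pairs so the best match comes first for `metric`."""
    if metric in LOWER_IS_BETTER:
        reverse = False
    elif metric in HIGHER_IS_BETTER:
        reverse = True
    else:
        raise ValueError(f"unknown metric label: {metric!r}")
    return sorted(results, key=lambda pair: pair[1], reverse=reverse)


print(best_first([(1, 3), (2, 1)], "hamming"))
print(best_first([(1, 0.2), (2, 0.9)], "cosine"))
```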
Named and sparse vectors¶
Adding named vectors fails¶
Check:
- the named field exists;
- vector dimension matches the field dimension;
- IDs already exist in the primary collection;
- the number of vectors equals the number of IDs.
collection.create_vector_field("image", dim=512, metric="l2")
collection.add_named_vectors("image", image_vectors, ids=image_ids)
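Three of the checks in the list above can be run on the client before the call. The sketch below is plain Python; `check_named_vectors` is a hypothetical helper, and it assumes you already have the set of IDs present in the primary collection. Whether the named field exists still has to be confirmed against the collection itself.

```python
def check_named_vectors(vectors, ids, field_dim, existing_ids):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(vectors) != len(ids):
        problems.append(
            f"count mismatch: {len(vectors)} vectors for {len(ids)} IDs"
        )
    for position, vector in enumerate(vectors):
        if len(vector) != field_dim:
            problems.append(
                f"vector {position} has length {len(vector)}, expected {field_dim}"
            )
    known = set(existing_ids)
    for item_id in ids:
        if item_id not in known:
            problems.append(f"ID {item_id} is not in the primary collection")
    return problems


print(check_named_vectors([[0.0] * 512], [1], field_dim=512, existing_ids=[1, 2]))
```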
Sparse vector search fails¶
Sparse feature IDs must be non-negative integers and weights must be numeric:
collection.add_sparse_vectors([{10: 1.0, 42: 0.5}], ids=[1])
collection.search_sparse({42: 1.0}, k=10)
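A sparse vector can be checked against these two rules before it is sent. The sketch below is plain Python with an illustrative helper name; rejecting booleans as IDs or weights is an assumption of this example, not a documented LynseDB rule.

```python
import numbers


def sparse_vector_problems(sparse):
    """Return a list of problems found in a {feature_id: weight} mapping."""
    problems = []
    for feature_id, weight in sparse.items():
        # Assumption: bool is rejected even though it subclasses int.
        id_ok = (
            isinstance(feature_id, int)
            and not isinstance(feature_id, bool)
            and feature_id >= 0
        )
        if not id_ok:
            problems.append(f"invalid feature ID: {feature_id!r}")
        weight_ok = isinstance(weight, numbers.Real) and not isinstance(weight, bool)
        if not weight_ok:
            problems.append(f"non-numeric weight for {feature_id!r}: {weight!r}")
    return problems


print(sparse_vector_problems({10: 1.0, 42: 0.5}))
print(sparse_vector_problems({-1: 1.0, 3: "x"}))
```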
HTTP server limits¶
Request rejected because it is too large¶
Lower your client batch size:
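One way to keep each request small is to split the input into fixed-size chunks on the client and send them one at a time. The generator below is a generic sketch in plain Python; `iter_batches` is an illustrative name, and the call that sends each chunk is left to your own code.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


rows = list(range(5))
for batch in iter_batches(rows, batch_size=2):
    print(batch)  # send each batch in its own request
```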
Or increase server limits:
lynse serve \
--data-dir ./server-data \
--json-limit-mb 512 \
--payload-limit-mb 1024 \
--max-batch-vectors 200000
k or batch size is rejected¶
Check:
- --max-top-k;
- --max-batch-vectors;
- --max-collection-vectors;
- --max-collection-vector-bytes.
Set a limit to 0 only when you intentionally want to disable that guardrail.
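The "0 disables the guardrail" convention can be pictured as follows. This is a sketch of the convention only, not the server's implementation; treating the limit as inclusive is an assumption of this example.

```python
def exceeds_limit(value, limit):
    """Return True when `value` is over `limit`; a limit of 0 means no limit."""
    if limit == 0:
        return False  # guardrail disabled
    return value > limit  # assumption: a value equal to the limit is allowed


print(exceeds_limit(250_000, 200_000))
print(exceeds_limit(250_000, 0))
```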
Deletes and compaction¶
Deleted rows still take disk space¶
Deletes are soft deletes:
Physically remove tombstoned rows during maintenance:
After compaction, rows cannot be restored from the collection itself.
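The toy model below shows why a soft-deleted row still occupies storage and why it can no longer be restored after compaction. It is a conceptual sketch in plain Python; the class and method names are invented for this example and do not reflect LynseDB's API or on-disk format.

```python
class ToyCollection:
    """In-memory stand-in for a collection with tombstone-based deletes."""

    def __init__(self):
        self.rows = {}           # id -> stored value
        self.tombstones = set()  # ids marked as deleted

    def add(self, item_id, value):
        self.rows[item_id] = value

    def delete(self, item_id):
        # Soft delete: the row stays in storage and is only hidden.
        if item_id in self.rows:
            self.tombstones.add(item_id)

    def restore(self, item_id):
        # Works only while the tombstoned row is still stored.
        if item_id in self.tombstones:
            self.tombstones.discard(item_id)
            return True
        return False

    def compact(self):
        # Physically drop tombstoned rows; restore is impossible afterwards.
        for item_id in self.tombstones:
            del self.rows[item_id]
        self.tombstones.clear()

    def visible_ids(self):
        return sorted(i for i in self.rows if i not in self.tombstones)

    def stored_rows(self):
        return len(self.rows)


c = ToyCollection()
c.add(1, "a")
c.add(2, "b")
c.delete(1)
print(c.visible_ids(), c.stored_rows())  # row 1 is hidden but still stored
c.compact()
print(c.visible_ids(), c.stored_rows(), c.restore(1))
```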
Deleted row appears again¶
It may have been restored:
Inspect tombstones:
Backups¶
Snapshot path is not where expected¶
In local mode, snapshot and export paths are on the Python process filesystem. In remote mode, they are on the server filesystem.
For remote mode, ./app.snapshot is relative to the server process working
directory.
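To predict where a relative snapshot path will land, resolve it against the working directory of the process that performs the write. The sketch below uses `posixpath` because the server runs on Linux or macOS; the directories are hypothetical examples.

```python
import posixpath


def resolve_against(cwd, path):
    """Resolve `path` the way a process running in `cwd` would."""
    # An absolute `path` replaces `cwd`; a relative one is joined to it.
    return posixpath.normpath(posixpath.join(cwd, path))


# Hypothetical server working directory.
print(resolve_against("/srv/lynse", "./app.snapshot"))
print(resolve_against("/srv/lynse", "/backups/app.snapshot"))
```

Passing an absolute path removes the dependency on the working directory in either mode.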
Restore confidence is low¶
Run a restore drill:
client.restore_database("app_restore_test", "./app.snapshot", overwrite=True)
test_db = client.get_database("app_restore_test")
print(test_db.show_collections_details())
Quick diagnostic script¶
def inspect_collection(collection):
    print("shape:", collection.shape)
    print("stats:", collection.stats())
    print("index:", collection.index_mode)
    print("fields:", collection.list_fields())
    print("vector_fields:", collection.list_vector_fields())
    print("deleted:", collection.list_deleted_ids()[:10])
    print("max_id:", collection.max_id)
Use this before changing data or rebuilding indexes. It gives you a quick view of the collection state.