Tutorial: Troubleshooting

This page maps common LynseDB symptoms to likely causes and fixes.

Installation

Native Windows does not work

Native Windows environments are not supported. Use one of:

  • WSL 2 with Linux Python;
  • Docker server mode;
  • a Linux or macOS environment.

Import fails

Check the Python version:

python --version

LynseDB requires Python 3.9 or newer.

Reinstall in the active environment:

python -m pip install -U LynseDB
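
The checks in this section can also be run from inside the interpreter that fails to import. The following is a minimal sketch using only the standard library; the helper name check_environment is illustrative and not part of LynseDB, and the module name lynse is taken from the client examples on this page.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated on this page


def check_environment(module_name="lynse"):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {found} is older than 3.9")
    if sys.platform == "win32":
        problems.append("native Windows is not supported; use WSL 2 or Docker")
    # find_spec returns None when the module cannot be found by this interpreter.
    if importlib.util.find_spec(module_name) is None:
        problems.append(f"module {module_name!r} not found in {sys.executable}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("problem:", problem)
```

Printing sys.executable helps spot the common case where pip installed the package into a different environment than the one running the script.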

Connection

Cannot connect to remote server

Start the server:

lynse serve --host 127.0.0.1 --port 7637 --data-dir ./server-data

Check the root endpoint:

curl http://127.0.0.1:7637/

Then connect:

import lynse
client = lynse.VectorDBClient("http://127.0.0.1:7637")

Make sure the client URL includes http:// or https://.
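
A missing scheme is easy to catch before the client is created. This sketch validates the URL with the standard library; require_http_url is a hypothetical helper, not a LynseDB function.

```python
from urllib.parse import urlsplit


def require_http_url(url):
    """Return url unchanged if it has an http(s) scheme and a host."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"URL must start with http:// or https://, got {url!r}")
    if not parts.hostname:
        raise ValueError(f"URL has no host: {url!r}")
    return url


print(require_http_url("http://127.0.0.1:7637"))
```

A value such as "127.0.0.1:7637" raises ValueError here instead of failing later inside the client.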

Authentication failed

If the server was started with --api-key, pass the same key:

client = lynse.VectorDBClient(
    "http://127.0.0.1:7637",
    api_key="your_key",
)

Raw HTTP requests need an Authorization header that carries the same key as a Bearer token:

curl -H "Authorization: Bearer your_key" http://127.0.0.1:7637/list_databases

Public endpoints are /, /healthz, and /readyz. Other endpoints require auth when an API key is configured.
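
The same rule can be expressed in Python when you probe the server without the client library. This is a sketch with the standard library only; build_request and PUBLIC_PATHS are illustrative names, and the paths are the ones listed above.

```python
import urllib.request

PUBLIC_PATHS = {"/", "/healthz", "/readyz"}  # listed as public on this page


def build_request(base_url, path, api_key=None):
    """Create a GET request, adding a Bearer token for protected paths."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    if api_key and path not in PUBLIC_PATHS:
        request.add_header("Authorization", f"Bearer {api_key}")
    return request


request = build_request("http://127.0.0.1:7637", "/list_databases", "your_key")
print(request.full_url, request.get_header("Authorization"))
```

Nothing is sent until the request is passed to urllib.request.urlopen(request, timeout=5), so the headers can be inspected first.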

Local data path is shared by multiple processes

Do not let independent writer processes share the same local root path. Use HTTP server mode so one process owns the data directory:

lynse serve --host 0.0.0.0 --port 7637 --data-dir ./server-data

Database and collection setup

Database does not exist

Use create_database() once, then get_database() later:

db = client.create_database("app")
db = client.get_database("app")

Inspect names:

print(client.list_databases())

Collection does not exist

Create or open:

collection = db.require_collection("docs", dim=768)

Open only if it exists:

collection = db.get_collection("docs")

Inspect names:

print(db.show_collections())

Data disappeared after a test

Look for drop_if_exists=True:

client.create_database("app", drop_if_exists=True)
db.require_collection("docs", dim=768, drop_if_exists=True)

These flags are destructive and should be limited to tests or explicit reset scripts.

Ingestion

Vector dimension error

The collection dimension is fixed:

collection = db.require_collection("docs", dim=768)

Every inserted primary vector must have length 768:

import numpy as np
vector = np.asarray(vector, dtype=np.float32)
assert vector.shape == (768,)

For a different embedding model dimension, create a different collection or a named vector field with that dimension.

Duplicate ID error

add_item() and bulk_add_items() expect new IDs. Use upsert to replace or insert by ID:

collection.upsert_item(vector, id=123, field={"title": "updated"})
collection.commit()

Check existing IDs:

print(collection.is_id_exists(123))
print(collection.max_id)
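
When a batch mixes new and existing IDs, it can help to split it before choosing between add and upsert. A minimal sketch, assuming the membership check is passed in as a callable; split_new_and_existing is a hypothetical helper, and a plain set stands in for the collection here.

```python
def split_new_and_existing(ids, exists):
    """Partition ids by a membership predicate, preserving input order."""
    new_ids, existing_ids = [], []
    for item_id in ids:
        (existing_ids if exists(item_id) else new_ids).append(item_id)
    return new_ids, existing_ids


# Stand-in for a collection that already holds IDs 1 and 2.
stored = {1, 2}
new_ids, existing_ids = split_new_and_existing([1, 2, 3, 4], stored.__contains__)
print(new_ids, existing_ids)  # [3, 4] [1, 2]
```

With a real collection, collection.is_id_exists could be passed as exists. That makes one call per ID, which may be slow against a remote server.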

Writes are not visible as expected

Use insert_session() or call commit():

with collection.insert_session() as session:
    session.add_item(vector, id=1)

# or
collection.add_item(vector, id=2, buffer_size=False)
collection.commit()

Before a backup or shutdown, call:

collection.checkpoint()

Search and query

Search returns no rows

Check:

  • vectors were inserted and committed;
  • the query vector has the correct dimension;
  • k is greater than zero;
  • the where filter is not too restrictive;
  • rows were not soft-deleted.

Useful inspection:

print(collection.shape)
print(collection.stats())
print(collection.list_deleted_ids())
print(collection.search_profile(query, k=5, where="tenant = 'acme'"))

query() returns empty results

This is expected when neither a filter nor explicit IDs are provided:

collection.query()
collection.query_vectors()

Pass a filter or explicit IDs:

collection.query(where="tenant = 'acme'")
collection.query(filter_ids=[1, 2, 3])

Metadata is missing from search results

Set return_fields=True:

result = collection.search(query, k=10, return_fields=True)

Or fetch fields after search:

result = collection.search(query, k=10)
rows = collection.query(filter_ids=result.ids.tolist())

Filter syntax is wrong

Common valid filters:

where = "lang = 'en'"
where = "rank >= 10 AND rank < 20"
where = "published = true"
where = "tags CONTAINS 'vector'"
where = "created_at >= '2026-06-01'"
where = "\"document.lang\" = 'en'"

See the Metadata filter cookbook for more examples.
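
Building filter strings by hand is a common source of quoting mistakes. The sketch below only reproduces the patterns shown above (single-quoted strings, lowercase booleans, double-quoted dotted field names). The helpers quote_field, literal, and condition are hypothetical, and because escaping rules are not described on this page, string values that contain a single quote are rejected rather than guessed at.

```python
def quote_field(name):
    """Double-quote dotted field names, as in the "document.lang" example."""
    return f'"{name}"' if "." in name else name


def literal(value):
    """Render a Python value as a filter literal."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        if "'" in value:
            # Escaping is not covered on this page, so refuse rather than guess.
            raise ValueError("single quotes in string values are not handled here")
        return f"'{value}'"
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


def condition(field, op, value):
    """Join a field, an operator, and a literal into one filter condition."""
    return f"{quote_field(field)} {op} {literal(value)}"


where = " AND ".join([condition("rank", ">=", 10), condition("rank", "<", 20)])
print(where)  # rank >= 10 AND rank < 20
```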

Indexes

IVF index build fails or behaves badly

Pass n_clusters:

collection.build_index("IVF-L2", n_clusters=256)

Then tune nprobe at search time:

collection.search(query, k=10, nprobe=20)

If recall is low, compare with FLAT-L2, increase nprobe, or use fewer clusters.

ANN results differ from expected exact neighbors

Approximate indexes trade recall for speed. Rebuild a flat baseline:

collection.build_index("FLAT-L2")
baseline = collection.search(query, k=10)

Then tune the ANN index:

collection.build_index("HNSW-L2")
candidate = collection.search(query, k=10, nprobe=64)
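
To put a number on the difference, compare the two ID lists. This is a minimal recall sketch in plain Python; recall_at_k is an illustrative helper, and the sample IDs are made up.

```python
def recall_at_k(baseline_ids, candidate_ids):
    """Fraction of exact (baseline) neighbors that the ANN search also returned."""
    baseline = set(baseline_ids)
    if not baseline:
        raise ValueError("baseline_ids is empty")
    return len(baseline & set(candidate_ids)) / len(baseline)


print(recall_at_k([1, 2, 3, 4], [1, 2, 4, 9]))  # 3 of 4 baseline IDs found: 0.75
```

With real results, the inputs would be baseline.ids.tolist() and candidate.ids.tolist(), following the result.ids usage shown earlier on this page. Averaging over many queries gives a more stable figure than a single query.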

nprobe appears to do nothing

nprobe controls IVF and HNSW search breadth. Flat, PQ, RaBitQ, PolarVec, and named vector-field searches may ignore it.

Binary index scores seem reversed

Hamming and Jaccard are lower-is-better distances. Inner product and cosine are higher-is-better scores.
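
If application code re-sorts or thresholds results, the direction has to follow the metric. A small sketch of that rule; the metric labels and the rank_hits helper are assumptions for illustration and do not correspond to LynseDB index names.

```python
LOWER_IS_BETTER = {"hamming", "jaccard"}  # distances
HIGHER_IS_BETTER = {"ip", "cosine"}  # similarity scores


def rank_hits(hits, metric):
    """Sort (id, score) pairs best-first for the given metric."""
    if metric in LOWER_IS_BETTER:
        return sorted(hits, key=lambda hit: hit[1])
    if metric in HIGHER_IS_BETTER:
        return sorted(hits, key=lambda hit: hit[1], reverse=True)
    raise ValueError(f"unknown metric: {metric!r}")


print(rank_hits([(1, 5), (2, 1)], "hamming"))  # [(2, 1), (1, 5)]
print(rank_hits([(1, 0.2), (2, 0.9)], "cosine"))  # [(2, 0.9), (1, 0.2)]
```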

Named and sparse vectors

Adding named vectors fails

Check:

  • the named field exists;
  • vector dimension matches the field dimension;
  • IDs already exist in the primary collection;
  • the number of vectors equals the number of IDs.

collection.create_vector_field("image", dim=512, metric="l2")
collection.add_named_vectors("image", image_vectors, ids=image_ids)
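
Most of the checklist above can be verified on the client before the call. The sketch below covers the count, dimension, and ID checks; check_named_batch is a hypothetical helper, and the ID lookup is passed in as a callable so the example runs without a collection.

```python
def check_named_batch(vectors, ids, dim, exists):
    """Return a list of problems found in a named-vector batch."""
    problems = []
    if len(vectors) != len(ids):
        problems.append(f"{len(vectors)} vectors but {len(ids)} ids")
    for index, vector in enumerate(vectors):
        if len(vector) != dim:
            problems.append(f"vector {index} has length {len(vector)}, expected {dim}")
    missing = [item_id for item_id in ids if not exists(item_id)]
    if missing:
        problems.append(f"ids missing from the primary collection: {missing}")
    return problems


# Stand-in data: primary collection holds IDs 1 and 2, field dimension is 3.
primary_ids = {1, 2}
batch = [[0.1, 0.2, 0.3], [0.4, 0.5]]
for problem in check_named_batch(batch, [1, 7], 3, primary_ids.__contains__):
    print(problem)
```

With a real collection, collection.is_id_exists could serve as exists. Whether the named field itself exists still has to be checked separately, for example by inspecting collection.list_vector_fields().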

Sparse vector search fails

Sparse feature IDs must be non-negative integers and weights must be numeric:

collection.add_sparse_vectors([{10: 1.0, 42: 0.5}], ids=[1])
collection.search_sparse({42: 1.0}, k=10)
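
Malformed sparse input is easier to debug when it is rejected before the request is made. A minimal validation sketch for the two rules above; validate_sparse is a hypothetical helper, it accepts only built-in int keys (NumPy integers would need converting first), and rejecting booleans is a choice made here, not documented behavior.

```python
from numbers import Real


def validate_sparse(vector):
    """Raise ValueError if a {feature_id: weight} mapping is malformed."""
    for feature_id, weight in vector.items():
        if isinstance(feature_id, bool) or not isinstance(feature_id, int) or feature_id < 0:
            raise ValueError(f"feature id must be a non-negative integer: {feature_id!r}")
        if isinstance(weight, bool) or not isinstance(weight, Real):
            raise ValueError(f"weight must be numeric: {weight!r}")
    return vector


print(validate_sparse({10: 1.0, 42: 0.5}))
```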

HTTP server limits

Request rejected because it is too large

Lower your client batch size:

collection.bulk_add_items(items, batch_size=1000, enable_progress_bar=False)

Or increase server limits:

lynse serve \
  --data-dir ./server-data \
  --json-limit-mb 512 \
  --payload-limit-mb 1024 \
  --max-batch-vectors 200000
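
A rough size estimate helps when picking a batch size against a payload limit. The sketch assumes 4 bytes per value (float32) and that the limit is counted in mebibytes; both are assumptions, the helper name is illustrative, and JSON-encoded requests are considerably larger than the raw values, so treat the result as an upper bound.

```python
def max_vectors_for_payload(dim, limit_mb, bytes_per_value=4):
    """Largest vector count whose raw values fit in limit_mb mebibytes."""
    if dim <= 0 or limit_mb <= 0:
        raise ValueError("dim and limit_mb must be positive")
    return (limit_mb * 1024 * 1024) // (dim * bytes_per_value)


# 768-dimensional float32 vectors against a 1024 MB payload limit.
print(max_vectors_for_payload(768, 1024))  # 349525
```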

k or batch size is rejected

Check:

  • --max-top-k;
  • --max-batch-vectors;
  • --max-collection-vectors;
  • --max-collection-vector-bytes.

Set a limit to 0 only when you intentionally want to disable that guardrail.

Deletes and compaction

Deleted rows still take disk space

Deletes are soft deletes:

collection.delete_items([1, 2, 3])
collection.commit()

Physically remove tombstoned rows during maintenance:

removed = collection.compact()
print(removed)

After compaction, rows cannot be restored from the collection itself.

Deleted row appears again

The row may have been restored with restore_items():

collection.restore_items([1])

Inspect tombstones:

print(collection.list_deleted_ids())

Backups

Snapshot path is not where expected

In local mode, snapshot and export paths are on the Python process filesystem. In remote mode, they are on the server filesystem.

client.snapshot_database("app", "./app.snapshot")

For remote mode, ./app.snapshot is relative to the server process working directory.
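
In local mode, resolving the path first shows exactly where the file will be written. This uses only the standard library and applies to the Python process that runs it, so it says nothing about a remote server's working directory.

```python
from pathlib import Path

# Resolve the relative path against the current working directory.
snapshot_path = Path("./app.snapshot").resolve()
print(snapshot_path)
```

Passing an absolute path in both modes avoids the ambiguity.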

Restore confidence is low

Run a restore drill:

client.restore_database("app_restore_test", "./app.snapshot", overwrite=True)
test_db = client.get_database("app_restore_test")
print(test_db.show_collections_details())
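
A drill is more convincing when the restored contents are compared against the source. The sketch below compares two lists of collection names; diff_names is a hypothetical helper, and it assumes the names are available as iterables of strings, for example from show_collections() on each database.

```python
def diff_names(expected, restored):
    """Return (missing, unexpected) collection names as sorted lists."""
    expected_set, restored_set = set(expected), set(restored)
    return sorted(expected_set - restored_set), sorted(restored_set - expected_set)


# Made-up names standing in for the source and restored databases.
missing, unexpected = diff_names(["docs", "images"], ["docs"])
print("missing:", missing, "unexpected:", unexpected)
```

Both lists being empty means the restored database has the same collection names as the source. Comparing collection.shape for each collection would extend the same idea to row counts.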

Quick diagnostic script

def inspect_collection(collection):
    print("shape:", collection.shape)
    print("stats:", collection.stats())
    print("index:", collection.index_mode)
    print("fields:", collection.list_fields())
    print("vector_fields:", collection.list_vector_fields())
    print("deleted:", collection.list_deleted_ids()[:10])
    print("max_id:", collection.max_id)

Use this before changing data or rebuilding indexes. It gives you a quick view of the collection state.