Quick Start¶
LynseDB version is: 0.1.5
Initialize Database¶
LynseDB now supports both an HTTP API and a native Python API.
The HTTP API mode requires an HTTP server to be running beforehand. You have two options:
- Start directly
For direct startup, the default port is 7637. Run the following command in a terminal to start the service:
- Within Docker
In Docker, run the following command in a terminal to start the service:
- Remote deployment
If you want to deploy remotely, you can bind the image to port 80 of the remote host, or have the host open access to port 7637, for example:
- Test whether the API is available
Request http://localhost:7637 directly in a browser.
If the image is bound to port 80, use http://localhost instead.
If the image is bound to port 80 of a remote host, access it at http://your_host_ip
Server running at http://127.0.0.1:7637
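The availability check can also be scripted instead of done in a browser. The following is a minimal sketch using only the Python standard library. The helper name `is_api_available` is hypothetical and not part of LynseDB, and the sketch assumes the server answers a plain GET on its root URL with a 2xx status.

```python
import http.client
import urllib.request


def is_api_available(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers URLError, HTTPError, refused connections and timeouts;
        # ValueError covers malformed URLs.
        return False


if __name__ == "__main__":
    # Default port for direct startup, then port 80 for a bound image.
    for url in ("http://localhost:7637", "http://localhost"):
        state = "reachable" if is_api_available(url) else "not reachable"
        print(url, "->", state)
```

For a remote deployment, pass `http://your_host_ip` (or the host with port 7637) instead of the localhost URLs.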
Create a collection¶
WARNING
When using the require_collection method to request a collection, setting the drop_if_exists parameter to True deletes all content of the collection if it already exists.
A safer method is to use the get_collection method. It is recommended to use the require_collection method only when you need to reinitialize a collection or create a new one.
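To make the difference concrete, here is a toy in-memory model of the behaviour described in the warning. It is a sketch, not LynseDB code: the `ToyDatabase` class is hypothetical, and two details are assumptions made for illustration, namely that `drop_if_exists` defaults to False and that `get_collection` raises an error for an unknown name.

```python
class ToyDatabase:
    """In-memory stand-in that only illustrates the semantics; not LynseDB."""

    def __init__(self):
        self._collections = {}  # collection name -> list of stored items

    def require_collection(self, name, drop_if_exists=False):
        # Create the collection if it is missing.
        # With drop_if_exists=True, an existing collection is wiped first.
        if drop_if_exists or name not in self._collections:
            self._collections[name] = []
        return self._collections[name]

    def get_collection(self, name):
        # Never creates or deletes anything (assumed behaviour).
        if name not in self._collections:
            raise KeyError(f"collection {name!r} does not exist")
        return self._collections[name]


db = ToyDatabase()
db.require_collection("test_collection").append("vector-1")

# Safe: returns the existing collection untouched.
assert db.get_collection("test_collection") == ["vector-1"]

# Also non-destructive while drop_if_exists stays False.
assert db.require_collection("test_collection") == ["vector-1"]

# Destructive: the existing content is discarded.
assert db.require_collection("test_collection", drop_if_exists=True) == []
```

The last call is the case the warning is about: the collection still exists afterwards, but its previous content is gone.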
Show database collections¶
If the pandas library is installed, the show_collections_details method returns the details as a pandas DataFrame. Otherwise, it returns a dict.
| | dim | chunk_size | dtypes | use_cache | n_threads | warm_up | description | cache_chunks |
|---|---|---|---|---|---|---|---|---|
| test_collection | 4 | 100000 | float32 | True | 10 | False | demo collection | 20 |
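The "DataFrame if pandas is installed, dict otherwise" behaviour is a common optional-dependency pattern. The sketch below shows one way it could work; the function name `show_details` and the shape of the input dict are assumptions for illustration and do not reflect LynseDB's internals. The values are taken from the table above.

```python
try:
    import pandas as pd  # optional dependency
except ImportError:
    pd = None


def show_details(details: dict):
    """Return a DataFrame when pandas is importable, else the plain dict."""
    if pd is not None:
        # One row per collection, one column per setting.
        return pd.DataFrame.from_dict(details, orient="index")
    return details


details = {
    "test_collection": {
        "dim": 4,
        "chunk_size": 100000,
        "dtypes": "float32",
        "use_cache": True,
        "n_threads": 10,
        "warm_up": False,
        "description": "demo collection",
        "cache_chunks": 20,
    }
}

result = show_details(details)
print(type(result).__name__)  # "DataFrame" with pandas installed, "dict" without
```

Code that consumes the result should handle both return types, or convert explicitly, if it has to run in environments with and without pandas.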
Update description¶
| | dim | chunk_size | dtypes | use_cache | n_threads | warm_up | description | cache_chunks |
|---|---|---|---|---|---|---|---|---|
| test_collection | 4 | 100000 | float32 | True | 10 | False | Hello World | 20 |
Add vectors¶
When inserting vectors, you must either run the commit function manually or insert within the insert_session context manager, which runs the commit function in the background.
It is strongly recommended to use the insert_session context manager for insertion, as this provides more comprehensive data security features during the insertion process.
2024-09-12 17:33:36 - LynseDB - INFO - Task status: {'status': 'Processing'}
2024-09-12 17:33:38 - LynseDB - INFO - Task status: {'result': {'collection_name': 'test_collection', 'database_name': 'test_db'}, 'status': 'Success'}
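The idea of a session that commits for you can be sketched with a small context manager. This is a toy model, not LynseDB's implementation: `ToyCollection`, `add_item`, and the `pending`/`committed` lists are hypothetical, the commit runs synchronously rather than in the background, and discarding the batch on error is only an assumption about what safer insertion might mean. The two vectors are ids 1 and 2 from this page.

```python
from contextlib import contextmanager


class ToyCollection:
    """In-memory stand-in; illustrates staging and commit only."""

    def __init__(self):
        self.pending = []    # items added but not yet committed
        self.committed = []  # items visible to queries

    def add_item(self, vector, item_id):
        self.pending.append((item_id, vector))

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()

    @contextmanager
    def insert_session(self):
        try:
            yield self
        except Exception:
            # Discard staged items so a failed batch is never half-committed.
            self.pending.clear()
            raise
        else:
            # Leaving the block normally commits automatically.
            self.commit()


collection = ToyCollection()
with collection.insert_session() as session:
    session.add_item([0.01, 0.34, 0.74, 0.31], item_id=1)
    session.add_item([0.36, 0.43, 0.56, 0.12], item_id=2)

print(len(collection.committed))  # 2
```

Without the context manager, the same two `add_item` calls would leave both items in `pending` until `commit()` is called explicitly, which is the step that is easy to forget.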
Find the nearest neighbors of a given vector¶
The default similarity measure for query is Inner Product (IP). You can specify cosine or L2 to obtain the similarity measure you need.
ids: [ 9 5 10]
scores: [ 0.18610001 -0.16069996 -0.23799998]
fields: [{':id:': 9, 'field': 'test_3', 'order': 8}, {':id:': 5, 'field': 'test_2', 'order': 4}, {':id:': 10, 'field': 'test_1', 'order': 9}]
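For reference, the three measures mentioned above can be written out in a few lines of plain Python. This is a sketch of the textbook definitions, not of LynseDB's search code: the function names and the `top_k` helper are hypothetical, the query vector is made up, and whether LynseDB reports L2 as a distance, a squared distance, or a negated score is not stated here. The stored vectors are ids 1 to 3 from this page.

```python
import math


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Inner product divided by the product of the two vector lengths.
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, vectors, k, measure="IP"):
    """Rank `vectors` (id -> vector) against `query`, best match first."""
    if measure == "IP":
        score, larger_is_better = inner_product, True
    elif measure == "cosine":
        score, larger_is_better = cosine_similarity, True
    elif measure == "L2":
        score, larger_is_better = l2_distance, False  # smaller distance wins
    else:
        raise ValueError(f"unknown measure: {measure!r}")
    scored = [(vid, score(query, vec)) for vid, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=larger_is_better)
    return scored[:k]


vectors = {
    1: [0.01, 0.34, 0.74, 0.31],
    2: [0.36, 0.43, 0.56, 0.12],
    3: [0.03, 0.04, 0.10, 0.51],
}
query = [0.5, 0.5, 0.5, 0.5]  # hypothetical query vector

for measure in ("IP", "cosine", "L2"):
    print(measure, [vid for vid, _ in top_k(query, vectors, k=2, measure=measure)])
```

Inner product rewards long vectors, cosine ignores length entirely, and L2 measures geometric distance, so the three can rank the same data differently.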
List data¶
ids: [1 2 3 4 5]
scores: [[0.01 0.34 0.74000001 0.31 ]
[0.36000001 0.43000001 0.56 0.12 ]
[0.03 0.04 0.1 0.50999999]
[0.11 0.44 0.23 0.23999999]
[0.91000003 0.43000001 0.44 0.67000002]]
fields: [{':id:': 1, 'field': 'test_1', 'order': 0}, {':id:': 2, 'field': 'test_1', 'order': 1}, {':id:': 3, 'field': 'test_2', 'order': 2}, {':id:': 4, 'field': 'test_2', 'order': 3}, {':id:': 5, 'field': 'test_2', 'order': 4}]
ids: [ 6 7 8 9 10]
scores: [[0.92000002 0.12 0.56 0.19 ]
[0.18000001 0.34 0.56 0.70999998]
[0.01 0.33000001 0.14 0.31 ]
[0.70999998 0.75 0.91000003 0.81999999]
[0.75 0.44 0.38 0.75 ]]
fields: [{':id:': 6, 'field': 'test_3', 'order': 5}, {':id:': 7, 'field': 'test_1', 'order': 6}, {':id:': 8, 'field': 'test_2', 'order': 7}, {':id:': 9, 'field': 'test_3', 'order': 8}, {':id:': 10, 'field': 'test_1', 'order': 9}]
Use FieldExpression for result filtering¶
ids: [2 7 1]
scores: [-0.35749996 -0.39020002 -0.39859998]
fields: None
Use Filter for more flexible conditional expressions¶
Using the Filter class for result filtering can maximize Recall.
The Filter class now supports the must, any, and must_not parameters, all of which only accept list-type argument values.
Every condition in must must be met, and none of the conditions in must_not may be met.
After filtering with the must and must_not conditions, the conditions in any are considered, and at least one of them must be met.
The filter result must satisfy both must and any, but not must_not.
ids: [2 7 1]
scores: [-0.35749996 -0.39020002 -0.39859998]
fields: None
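The must / any / must_not rule can be expressed as a small predicate. The sketch below is a plain-Python model of that logic, not the Filter class itself: `passes_filter` is a hypothetical helper, conditions are modelled as ordinary callables, the parameter is named `any_of` to avoid shadowing Python's built-in `any`, and treating an empty `any_of` list as "no constraint" is an assumption. The records are the field data listed earlier on this page; the conditions themselves are made up for illustration.

```python
def passes_filter(record, must=None, any_of=None, must_not=None):
    """Apply must / any / must_not semantics to a single record."""
    must, any_of, must_not = must or [], any_of or [], must_not or []
    if not all(cond(record) for cond in must):
        return False  # a required condition failed
    if any(cond(record) for cond in must_not):
        return False  # a forbidden condition matched
    # Assumption: an empty any_of list imposes no constraint.
    return not any_of or any(cond(record) for cond in any_of)


records = [
    {":id:": 1, "field": "test_1", "order": 0},
    {":id:": 2, "field": "test_1", "order": 1},
    {":id:": 3, "field": "test_2", "order": 2},
    {":id:": 4, "field": "test_2", "order": 3},
    {":id:": 5, "field": "test_2", "order": 4},
    {":id:": 6, "field": "test_3", "order": 5},
    {":id:": 7, "field": "test_1", "order": 6},
    {":id:": 8, "field": "test_2", "order": 7},
    {":id:": 9, "field": "test_3", "order": 8},
    {":id:": 10, "field": "test_1", "order": 9},
]

matching = [
    r[":id:"]
    for r in records
    if passes_filter(
        r,
        must=[lambda r: r["field"] == "test_1"],
        any_of=[lambda r: r["order"] < 2, lambda r: r["order"] == 6],
        must_not=[lambda r: r["order"] == 9],
    )
]
print(matching)  # [1, 2, 7]
```

Here id 10 satisfies `must` but is excluded by `must_not`, while ids 1, 2, and 7 each satisfy at least one `any_of` condition.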
Query fields¶
Query via FieldExpression¶
[{':id:': 1, 'field': 'test_1', 'order': 0},
{':id:': 2, 'field': 'test_1', 'order': 1},
{':id:': 7, 'field': 'test_1', 'order': 6}]
Query via Filter¶
[{':id:': 1, 'field': 'test_1', 'order': 0},
{':id:': 2, 'field': 'test_1', 'order': 1},
{':id:': 7, 'field': 'test_1', 'order': 6}]
Exact Match¶
[{':id:': 1, 'field': 'test_1', 'order': 0}]
Fuzzy Match¶
[{':id:': 1, 'field': 'test_1', 'order': 0},
{':id:': 2, 'field': 'test_1', 'order': 1},
{':id:': 10, 'field': 'test_1', 'order': 9},
{':id:': 7, 'field': 'test_1', 'order': 6}]
Query Vectors¶
As with field queries, you can query vectors using a FieldExpression string, the Filter class, an exact match, or a fuzzy match.
(array([1, 2, 7]),
array([[0.01 , 0.34 , 0.74000001, 0.31 ],
[0.36000001, 0.43000001, 0.56 , 0.12 ],
[0.18000001, 0.34 , 0.56 , 0.70999998]]),
[{':id:': 1, 'field': 'test_1', 'order': 0},
{':id:': 2, 'field': 'test_1', 'order': 1},
{':id:': 7, 'field': 'test_1', 'order': 6}])
(array([ 1, 2, 7, 10]),
array([[0.01 , 0.34 , 0.74000001, 0.31 ],
[0.36000001, 0.43000001, 0.56 , 0.12 ],
[0.18000001, 0.34 , 0.56 , 0.70999998],
[0.75 , 0.44 , 0.38 , 0.75 ]]),
[{':id:': 1, 'field': 'test_1', 'order': 0},
{':id:': 2, 'field': 'test_1', 'order': 1},
{':id:': 7, 'field': 'test_1', 'order': 6},
{':id:': 10, 'field': 'test_1', 'order': 9}])
Drop a collection¶
WARNING: This operation cannot be undone
Collection list before dropping: ['test_collection']
Collection list after dropped: []
Drop the database¶
WARNING: This operation cannot be undone
RemoteDatabaseInstance(name=test_db, exists=False)