This chapter describes how to create a vector index, which is required for vector similarity search. Without a vector index, similarity search cannot be performed in Hippo. The following example creates an IVF_FLAT index with 10 cluster centers ("nlist" set to 10) and uses Euclidean distance (L2) as the similarity metric.
curl -u shiva:shiva -XPUT 'localhost:8902/hippo/v1/{table}/_create_embedding_index?database_name={database_name}&pretty' -H 'Content-Type: application/json' -d'{
  "field_name" : "book_intro",
  "index_name" : "ivf_flat_index",
  "metric_type" : "l2",
  "index_type" : "IVF_FLAT",
  "params" : {
    "nlist" : 10
  }
}';
Result:
{
  "acknowledged" : true
}
Parameter description:
Parameters | Description | Options |
---|---|---|
table | Table name, such as "book" created in this example | |
database_name | Database where the destination table is located | |
field_name | Vector column where the vector index is to be created | |
index_name | Vector index name | |
metric_type | Similarity metric used to compare vectors, such as "l2" (Euclidean distance) in this example | |
index_type | Vector index type | |
params | Vector index parameters; the available keys depend on the vector index type, such as "nlist" for IVF_FLAT in this example | |
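
The request described in the table above can also be assembled programmatically. The following is a minimal sketch that only builds the URL and JSON body and does not contact a server. The helper name `build_create_index_request` and the database name "default" are assumptions made for illustration; the endpoint path, query parameters, and body fields are taken from the curl example.

```python
import json
from urllib.parse import quote, urlencode


def build_create_index_request(table, field_name, index_name, metric_type,
                               index_type, params, database_name=None,
                               host="localhost:8902"):
    """Return (url, body) mirroring the _create_embedding_index curl call.

    Hypothetical helper: it only formats the request, it does not send it.
    """
    query = {}
    if database_name is not None:
        query["database_name"] = database_name
    # The curl example passes "pretty" as a bare flag, so append it by hand.
    query_string = urlencode(query)
    query_string = f"{query_string}&pretty" if query_string else "pretty"
    url = (f"http://{host}/hippo/v1/{quote(table, safe='')}"
           f"/_create_embedding_index?{query_string}")
    body = json.dumps({
        "field_name": field_name,
        "index_name": index_name,
        "metric_type": metric_type,
        "index_type": index_type,
        "params": params,
    }, indent=2)
    return url, body


url, body = build_create_index_request(
    table="book",
    database_name="default",  # assumed database name, for illustration only
    field_name="book_intro",
    index_name="ivf_flat_index",
    metric_type="l2",
    index_type="IVF_FLAT",
    params={"nlist": 10},
)
print(url)
print(body)
```

The resulting URL and body can be sent with any HTTP client using the PUT method, basic authentication, and the `Content-Type: application/json` header, as in the curl example.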
Some index types, such as PQ and SQ, trade some search accuracy for a better compression ratio. For these indexes, Hippo provides an additional refine parameter, "index_slow_refine", which is set in params when the vector index is created. It improves the recall rate by increasing the number of records (topk) retrieved during search. The following example builds an index with this parameter enabled:
curl -u shiva:shiva -XPUT 'localhost:8902/hippo/v1/{table}/_create_embedding_index?pretty' -H 'Content-Type: application/json' -d'
{
  "field_name": "vec",
  "index_name": "ivf_sq_index",
  "metric_type": "L2",
  "index_type": "ivf_sq",
  "params": {
    "nlist": 512,
    "sq_type": "SQ8",
    "index_slow_refine": "true"
  }
}';
Parameter description:
Parameters | Description | Options |
---|---|---|
index_slow_refine (params) | Improves the recall rate; mainly used with PQ and SQ indexes | "true" or "false"; defaults to "false" |
Table 36 Create Vector Index – with refine parameter (Restful API)
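
To show why retrieving more candidates can improve recall on a compressed index, the sketch below applies a generic refine step to a toy scalar-quantized data set. It illustrates the general technique under stated assumptions and is not Hippo's internal implementation: the function names, the `refine_factor` argument, and the deliberately coarse 4-level quantizer are all hypothetical choices made for this example.

```python
import random


def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vec, lo, hi, levels):
    """Map each value in [lo, hi] to an integer code in [0, levels - 1]."""
    step = (hi - lo) / (levels - 1)
    return [round((min(max(x, lo), hi) - lo) / step) for x in vec]


def dequantize(codes, lo, hi, levels):
    """Map integer codes back to approximate float values."""
    step = (hi - lo) / (levels - 1)
    return [lo + c * step for c in codes]


def search(query, vectors, codes, lo, hi, levels, topk, refine_factor=1):
    """Rank by approximate distance, then optionally re-rank exactly.

    With refine_factor > 1, topk * refine_factor candidates are fetched from
    the compressed codes and re-ordered using the original vectors.
    """
    approx_order = sorted(
        range(len(codes)),
        key=lambda i: l2_squared(query, dequantize(codes[i], lo, hi, levels)),
    )
    candidates = approx_order[: topk * refine_factor]
    if refine_factor > 1:
        candidates.sort(key=lambda i: l2_squared(query, vectors[i]))
    return candidates[:topk]


def recall(found, truth):
    """Fraction of the true nearest neighbours present in the result."""
    return len(set(found) & set(truth)) / len(truth)


random.seed(0)
dim, n, topk, lo, hi, levels = 8, 200, 10, 0.0, 1.0, 4
vectors = [[random.random() for _ in range(dim)] for _ in range(n)]
codes = [quantize(v, lo, hi, levels) for v in vectors]
query = [random.random() for _ in range(dim)]

# Ground truth: exact nearest neighbours on the uncompressed vectors.
exact = sorted(range(n), key=lambda i: l2_squared(query, vectors[i]))[:topk]
plain = search(query, vectors, codes, lo, hi, levels, topk)
refined = search(query, vectors, codes, lo, hi, levels, topk, refine_factor=4)

plain_recall = recall(plain, exact)
refined_recall = recall(refined, exact)
print("recall without refine:", plain_recall)
print("recall with refine:   ", refined_recall)
```

Because the refined candidate set always contains the unrefined result, re-ranking it by exact distance cannot lower recall in this sketch. The cost is the extra distance computations on the original vectors.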