Create Vector Index

This chapter describes how to create a vector index, which is required for vector similarity search. Without a vector index, similarity search cannot be performed in Hippo. The following example creates an IVF_FLAT index with 10 cluster centers and uses Euclidean distance (L2) as the similarity metric.

curl -u shiva:shiva -XPUT 'localhost:8902/hippo/v1/{table}/_create_embedding_index?database_name={database_name}&pretty' -H 'Content-Type: application/json' -d'{
  "field_name" : "book_intro",
  "index_name" : "ivf_flat_index",
  "metric_type" : "l2",
  "index_type": "IVF_FLAT",
  "params": {
    "nlist" : 10
  }
}';

Result:

{
  "acknowledged" : true
}
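
The request and response above can be wrapped in a few small shell helpers. This is a minimal sketch, not part of Hippo: the function names (index_url, is_acknowledged, create_index) and the database name "default" used in the usage comment are assumptions made for illustration. Only the endpoint path, the credentials, and the "acknowledged" field come from the example above.

```shell
# Build the index-creation URL from host ($1), table ($2) and database ($3).
index_url() {
  printf '%s/hippo/v1/%s/_create_embedding_index?database_name=%s&pretty' "$1" "$2" "$3"
}

# Succeed only if the response body ($1) contains "acknowledged" : true.
is_acknowledged() {
  printf '%s' "$1" | grep -Eq '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Send the JSON payload ($4) to the endpoint and print the response body.
create_index() {
  curl -s -u shiva:shiva -XPUT "$(index_url "$1" "$2" "$3")" \
    -H 'Content-Type: application/json' -d "$4"
}

# Usage (hypothetical values; "default" is an assumed database name):
#   response=$(create_index localhost:8902 book default "$payload")
#   is_acknowledged "$response" || echo "index creation was not acknowledged" >&2
```

Keeping the URL construction and the response check in separate functions means both can be exercised without a running server.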

Parameter description:

Parameters | Description | Options
table | Table name, such as "book" created in this example |
database_name | Database where the destination table is located |
field_name | Vector column on which the vector index is created |
index_name | Vector index name |
metric_type | Vector similarity metric type used to measure similarity between vectors | L2 (Euclidean distance), IP (Inner product)
index_type | Vector index type | FLAT, IVF_FLAT, IVF_SQ, IVF_PQ, IVF_PQ_FS, HNSW
params | Vector index parameters, which depend on the vector index type |

Table 25 Create Vector Index (RESTful API)
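
Before sending a request, the metric_type and index_type values can be checked locally against the options listed above. The following is a rough sketch with hypothetical helper names (is_known_index_type, is_known_metric_type). The comparison is case-insensitive because the examples on this page use both upper- and lowercase spellings; whether the server accepts both is an assumption drawn from those examples, not a confirmed behavior.

```shell
# Return 0 if $1 is one of the index types listed in the table (case-insensitive).
is_known_index_type() {
  case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
    FLAT|IVF_FLAT|IVF_SQ|IVF_PQ|IVF_PQ_FS|HNSW) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 if $1 is one of the metric types listed in the table (case-insensitive).
is_known_metric_type() {
  case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
    L2|IP) return 0 ;;
    *) return 1 ;;
  esac
}
```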

Some index types, such as PQ and SQ, trade some recall for a better compression ratio. For these indexes, Hippo provides an additional refine-related parameter, "index_slow_refine", in params when creating the vector index. It improves the recall rate by increasing the number of returned records (topk). Here is an example of building such an index:

curl -u shiva:shiva -XPUT 'localhost:8902/hippo/v1/{table}/_create_embedding_index?pretty' -H 'Content-Type: application/json' -d'
{
  "field_name": "vec",
  "index_name": "ivf_sq_index",
  "metric_type": "L2",
  "index_type": "ivf_sq",
  "params": {
    "nlist": 512,
    "sq_type": "SQ8",
    "index_slow_refine": "true"
  }
}';

Parameter description:

Parameters | Description | Options
index_slow_refine (params) | Improves recall rate; mainly used for PQ and SQ indexes | Defaults to "false"

Table 36 Create Vector Index – with refine parameter (RESTful API)
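
To compare results with and without refine, the payload can be generated from a small template. This is an illustrative sketch only: the function name ivf_sq_payload and its argument order are hypothetical, and the field values mirror the example above. When the fourth argument is omitted, the function writes "false", matching the documented default.

```shell
# Print an IVF_SQ index-creation payload.
# $1 = vector column, $2 = index name, $3 = nlist,
# $4 = index_slow_refine ("true" or "false"; "false" when omitted).
ivf_sq_payload() {
  cat <<EOF
{
  "field_name": "$1",
  "index_name": "$2",
  "metric_type": "L2",
  "index_type": "ivf_sq",
  "params": {
    "nlist": $3,
    "sq_type": "SQ8",
    "index_slow_refine": "${4:-false}"
  }
}
EOF
}

# Usage: pass the generated payload to curl with -d, for example
#   curl -u shiva:shiva -XPUT 'localhost:8902/hippo/v1/{table}/_create_embedding_index?pretty' \
#     -H 'Content-Type: application/json' -d "$(ivf_sq_payload vec ivf_sq_index 512 true)"
```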