Scalar quantization ("SQ") is a data compression technique that converts floating-point values into Int4/6/8/16 integers, which can improve computational performance and reduce memory usage.

If the IVF_FLAT index is used, the vectors are stored without any compression, so the memory usage of IVF_FLAT is the same as that of the original vector data. When memory is limited, consider IVF_SQ. IVF_SQ8 applies scalar quantization to each vector that IVF places in a cluster unit. Scalar quantization converts each dimension of the original vector from a 4-byte floating-point number to a 1-byte unsigned integer, so the stored vector data shrinks to roughly a quarter of its original size and the IVF_SQ8 index file occupies much less space than the IVF_FLAT index file.
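The 4-byte to 1-byte conversion described above can be sketched in plain Python. This is only an illustrative sketch, not the index's actual implementation: it assumes a simple per-dimension min/max range learned from training vectors, and the function names `train_sq8`, `encode_sq8`, and `decode_sq8` are hypothetical.

```python
import struct

def train_sq8(vectors):
    """Learn a per-dimension (min, max) range from training vectors (assumed scheme)."""
    dims = len(vectors[0])
    lo = [min(v[d] for v in vectors) for d in range(dims)]
    hi = [max(v[d] for v in vectors) for d in range(dims)]
    return lo, hi

def encode_sq8(vec, lo, hi):
    """Map each float to an unsigned 8-bit code in [0, 255]."""
    codes = []
    for x, a, b in zip(vec, lo, hi):
        scale = (b - a) or 1.0                    # avoid division by zero on constant dimensions
        t = min(max((x - a) / scale, 0.0), 1.0)   # clamp values outside the trained range
        codes.append(round(t * 255))
    return bytes(codes)

def decode_sq8(codes, lo, hi):
    """Approximately reconstruct the floats from their 8-bit codes."""
    return [a + (c / 255) * (b - a) for c, a, b in zip(codes, lo, hi)]

# Compare storage: float32 uses 4 bytes per dimension, SQ8 uses 1 byte per dimension.
sample = [0.5, 0.0]
lo, hi = train_sq8([[0.0, -1.0], [1.0, 1.0], sample])
raw_size = len(struct.pack(f"{len(sample)}f", *sample))
sq8_size = len(encode_sq8(sample, lo, hi))
```

Decoding is lossy: under this scheme each reconstructed value can differ from the original by up to about half of one quantization step, that is (max - min) / 510 for that dimension.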

Building parameters:

| Parameter | Description | Range |
| --- | --- | --- |
| nlist | Number of cluster units | [1, 65536] |

Table 11 Building parameters of IVF_SQ

Search parameters:

| Parameter | Description | Range |
| --- | --- | --- |
| nprobe | Number of units to query | [1, nlist] |

Table 12 Search parameters of IVF_SQ
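
To show how nlist and nprobe interact, here is a minimal pure-Python sketch of the IVF part only. It makes several assumptions: the centroids are supplied by the caller (a real index would learn them by clustering), distances are squared L2, and the helper names `build_ivf` and `search_ivf` are hypothetical. Scalar quantization of the stored vectors is left out for brevity.

```python
def l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign each vector id to its nearest centroid; len(centroids) plays the role of nlist."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def search_ivf(query, vectors, centroids, lists, nprobe, k=1):
    """Scan only the nprobe cluster units whose centroids are closest to the query."""
    if not 1 <= nprobe <= len(centroids):
        raise ValueError("nprobe must be in [1, nlist]")
    probed = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [i for c in probed for i in lists[c]]
    return sorted(candidates, key=lambda i: l2(query, vectors[i]))[:k]

vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids = [[0.0, 0.5], [10.0, 10.5]]            # nlist = 2 in this toy example
lists = build_ivf(vectors, centroids)
top = search_ivf([9.0, 9.0], vectors, centroids, lists, nprobe=1)
```

In this sketch, a larger nprobe scans more cluster units, which tends to improve recall at the cost of more distance computations; setting nprobe equal to nlist scans every unit.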