Introduction
Overview
The rise of big data has made unstructured data management more important than ever, and extracting business value from unstructured data is now a priority for enterprises. Advances in machine learning and deep learning allow organizations to process unstructured data efficiently and to improve quality assurance efforts.
In artificial intelligence and machine learning, embeddings and vector databases have become increasingly important for a wide range of problems. An embedding represents a piece of data as a dense vector in a high-dimensional space, where it can be compared, searched, and analyzed more easily.
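To make the idea of comparing embeddings concrete, here is a minimal sketch of brute-force similarity search using cosine similarity. It is an illustration only, not Hippo's API: the function names `cosine_similarity` and `top_k`, and the toy 3-dimensional vectors, are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), vec_id)
              for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [vec_id for _, vec_id in scored[:k]]


# Toy 3-dimensional "embeddings" (hypothetical values); real embeddings
# typically have hundreds of dimensions.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}

nearest = top_k([1.0, 0.0, 0.0], embeddings, k=2)
```

A vector database replaces the linear scan in `top_k` with an index, so that the nearest vectors can be found without scoring every stored item.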
Transwarp Hippo (“Hippo”) is a proprietary, enterprise-grade, cloud native distributed vector database. It stores, indexes, and manages massive vector datasets, and accelerates workloads such as vector similarity search and clustering of dense vectors. Hippo provides high availability, high performance, and easy scale-out and scale-in. It supports vector search indexes and offers data sharding, partitioning, data persistence, incremental data ingestion, and hybrid search with vector and scalar filtering, so that enterprises can run real-time query, search, and candidate generation against massive vector data.
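Hybrid search with vector and scalar filtering can be pictured as follows. This is a hedged, in-memory sketch and not Hippo's API: the `hybrid_search` function, the row layout, and the `category` and `price` fields are hypothetical. It assumes the scalar filter is applied before distance ranking, and the source does not state which order Hippo uses.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hybrid_search(query, rows, predicate, k):
    """Keep rows whose scalar fields satisfy `predicate`, then return
    the k remaining rows closest to `query`, nearest first."""
    candidates = (row for row in rows if predicate(row))
    return heapq.nsmallest(
        k, candidates, key=lambda row: l2_distance(query, row["vector"])
    )


# Hypothetical rows: scalar attributes stored next to each vector.
rows = [
    {"id": 1, "category": "shoes", "price": 40, "vector": [0.1, 0.9]},
    {"id": 2, "category": "shoes", "price": 120, "vector": [0.2, 0.8]},
    {"id": 3, "category": "hats", "price": 15, "vector": [0.1, 0.85]},
    {"id": 4, "category": "shoes", "price": 35, "vector": [0.9, 0.1]},
]

hits = hybrid_search(
    [0.1, 0.9],
    rows,
    predicate=lambda r: r["category"] == "shoes" and r["price"] < 100,
    k=2,
)
hit_ids = [row["id"] for row in hits]
```

Rows 2 and 3 are close to the query in vector space but fail the scalar filter, so they never reach the ranking step.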
Hippo Architecture
Diagram
Major Components
Components | Functionalities | Brief Introduction |
---|---|---|
TDDMS | Transwarp distributed data management system | TDDMS is a proprietary enterprise distributed data management system. It provides strong or eventual consistency among replicas and best-matching data distribution. TDDMS also redistributes data automatically when storage is scaled out, and keeps data available without interrupting ongoing storage services when a storage device fails. |
Vector Engine | Vector search engine | Vector Engine is a proprietary search engine designed by Transwarp. It supports vector search on massive data and similarity search with high accuracy and high performance. |
Sophon Model Cube | Model cube | Sophon Model Cube (“SMC”) unifies all stages of the LLM lifecycle, including model release, evaluation, and deployment. SMC manages multimodal LLMs with high maintainability and operability, supports image, text, and hybrid models, and enables model evaluation and model experience, helping organizations get the most value from their LLMs. |
Roles in Hippo
- Hippo Master
  - Also called Shiva Master
  - Stores the metadata of TDDMS
  - Runs as a Raft group made up of multiple Master nodes
  - 3 or 5 Master nodes are recommended
- Hippo Tablet Server
  - Also called Shiva Tablet Server
  - Stores TDDMS data
  - At least 5 Tablet Server nodes are recommended
- Hippo Webserver
  - Built-in lightweight component for monitoring cluster and index status
  - Provides a REST API GUI
  - At least one Webserver is required
- Hippo HTTP Server
  - Handles HTTP requests
  - Supports integration through Python, Java, and HTTP
  - At least one HTTP Server is recommended
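The recommendation of 3 or 5 Master nodes follows from how a Raft group reaches agreement. The arithmetic below is a small sketch assuming the standard Raft majority quorum. The function names are hypothetical, and the source does not describe Hippo's exact quorum configuration.

```python
def raft_quorum(nodes):
    """Smallest number of nodes that forms a majority of the group."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can fail while a majority is still reachable."""
    return nodes - raft_quorum(nodes)


# Map each group size to (quorum, tolerated failures).
summary = {n: (raft_quorum(n), tolerated_failures(n)) for n in (3, 4, 5)}

for size, (quorum, failures) in summary.items():
    print(f"{size} masters: quorum={quorum}, tolerates {failures} failure(s)")
```

Under this assumption, a 4-node group tolerates no more failures than a 3-node group, which is why odd group sizes are the usual choice.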
Advantages
Advantages | Details |
---|---|
Cloud native system | Hippo is deployed on our proprietary cloud native operating system, which provides strong scale-out/scale-in, multi-tenancy, and resource management capabilities. |
Distributed deployment | Hippo supports distributed deployment, ensures strong consistency via the Raft algorithm, and enables failover and data recovery. |
Multi-model architecture | Hippo can integrate with other services deployed on the Transwarp Data Hub (“TDH”) platform to achieve federated search. |
High performance search | Hippo's multiprocessing architecture and GPU acceleration enable parallel search. Hippo also offers multiple index types, tuning techniques for search speed and memory usage, and register-level algorithm optimization, so that organizations can run analytics across different business scenarios. |
Multiple APIs integration | Hippo currently supports Python, RESTful, and Java APIs. |
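
The combination of data sharding and parallel search mentioned above is commonly implemented as a scatter-gather pattern: each shard computes its own top-k, and the partial results are merged. The sketch below illustrates that pattern only. It is not Hippo's implementation, all names are hypothetical, and a thread pool stands in for the multiprocessing and GPU execution that the source describes.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def search_shard(shard, query, k):
    """Brute-force top-k inside one shard.

    Returns (squared_distance, id) pairs sorted nearest first."""
    scored = (
        (sum((x - y) ** 2 for x, y in zip(query, vec)), vec_id)
        for vec_id, vec in shard.items()
    )
    return heapq.nsmallest(k, scored)


def parallel_search(shards, query, k):
    """Search every shard concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(
            pool.map(lambda shard: search_shard(shard, query, k), shards)
        )
    # Each partial list is already sorted, so a k-way merge keeps the order.
    merged = heapq.merge(*partials)
    return [vec_id for _, vec_id in islice(merged, k)]


# Two hypothetical shards holding 2-dimensional vectors.
shards = [
    {"a": [0.0, 0.0], "b": [5.0, 5.0]},
    {"c": [1.0, 0.0], "d": [0.0, 2.0]},
]

result = parallel_search(shards, [0.0, 0.0], k=2)
```

Because every shard returns its own k best candidates, the merged result is the same as a top-k search over the unsharded data. In CPython, threads running pure Python code do not execute in parallel, so this shows the structure of the approach rather than its speedup.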
Management Components
Components | Functionality | Brief Introduction |
---|---|---|
Transwarp Aquila | Intelligent operation and maintenance analysis platform | Aquila is a one-stop platform deployed in TDH for cluster monitoring, service monitoring, and database query monitoring. Aquila provides an integrated O&M portal for each data platform, offering security audit, log retrieval, performance monitoring, alerting, online operation and maintenance, root cause analysis, and other functions. |
Transwarp Manager | Big data management platform | Manager is a component used to deploy, manage, and operate Hippo clusters, as well as other services deployed in Manager. It supports one-click installation, one-click upgrade, and graphical operation and maintenance of products, and provides a health detection function that simplifies operation and maintenance for users. |
Transwarp Cloud Operating System | Cloud native OS | Transwarp Cloud Operating System (“TCOS”) is a proprietary cloud operating system built on Docker and Kubernetes. It offers a unified resource scheduling framework. With container orchestration, TCOS can schedule compute, storage, network, and other resources in a unified way. |