Connects seamlessly to multi-source heterogeneous data and supports massive distributed data storage.
Supports mainstream distributed computing frameworks such as Hadoop, delivers million-level throughput, and scales horizontally.
Includes multiple self-developed visual algorithm tools, along with a convenient feature-tracking tool that supports solution debugging.
Stateless internal microservices and a built-in network load balancer give the system high availability and horizontal scalability.
Isolates tenant and user permissions while sharing data through workspaces, and provides comprehensive tools for installation, deployment, operation, and maintenance.
Unified Access for Multi-Source Heterogeneous Data
Provides a unified access engine for structured and unstructured data, such as images and text, that connects seamlessly to mainstream data warehouses and relational databases; supports composite data access in AI application scenarios, including tagged image datasets, samples, and models.
Time Series Data Grouping Management
With data sharding storage, data can be sharded on a chosen time field and retrieved by time slice, so that the data shards a model needs in a machine learning scenario can be located quickly.
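As a rough illustration of sharding on a time field, the sketch below groups records into daily shards and then reads only the shards that fall inside a requested time window. The daily granularity, the `event_time` field name, and the helper functions are all assumptions made for the example; the product's actual storage layout is not described here.

```python
from collections import defaultdict
from datetime import date, datetime


def shard_key(ts: datetime) -> date:
    """Map a timestamp to its shard. Daily shards are an assumption."""
    return ts.date()


def build_shards(records, time_field="event_time"):
    """Group records into shards keyed by the chosen time field."""
    shards = defaultdict(list)
    for rec in records:
        shards[shard_key(rec[time_field])].append(rec)
    return shards


def select_shards(shards, start: date, end: date):
    """Return records from every shard whose key lies in [start, end]."""
    selected = []
    for key in sorted(shards):
        if start <= key <= end:
            selected.extend(shards[key])
    return selected


# Hypothetical sample data.
records = [
    {"event_time": datetime(2024, 1, 1, 9, 30), "value": 1},
    {"event_time": datetime(2024, 1, 1, 18, 0), "value": 2},
    {"event_time": datetime(2024, 1, 2, 7, 15), "value": 3},
    {"event_time": datetime(2024, 1, 5, 12, 0), "value": 4},
]
shards = build_shards(records)
window = select_shards(shards, date(2024, 1, 1), date(2024, 1, 2))
```

Because the time slice selects whole shards, a training job that needs two days of data touches two shards and never scans the rest.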
Unified Governance of Whole-Domain Data
Data Platform provides a unified governance framework and meta-information management for heterogeneous data. Isomorphic data is integrated through data groups and isolated through data domains. This supports data management per sub-business and per sub-scenario and improves data governance, both horizontally and vertically, in large-scale AI applications.
Data Lifecycle Positioning and Tracking
The unified locator (prn) builds a global identity system for enterprise data, making it possible to quickly locate data and track changes across its lifecycle.
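The format of the locator is not specified here, so the following is only a sketch of how a global identifier of this kind might be parsed and versioned. The `prn:<tenant>:<kind>:<name>:<version>` layout and the `Locator` class are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locator:
    """Hypothetical locator layout: prn:<tenant>:<kind>:<name>:<version>."""
    tenant: str
    kind: str
    name: str
    version: int

    def __str__(self) -> str:
        return f"prn:{self.tenant}:{self.kind}:{self.name}:{self.version}"

    @classmethod
    def parse(cls, text: str) -> "Locator":
        """Split a locator string into its fields, rejecting malformed input."""
        parts = text.split(":")
        if len(parts) != 5 or parts[0] != "prn":
            raise ValueError(f"not a valid locator: {text!r}")
        _, tenant, kind, name, version = parts
        return cls(tenant, kind, name, int(version))

    def next_version(self) -> "Locator":
        """Return the locator for the next lifecycle change of the same item."""
        return Locator(self.tenant, self.kind, self.name, self.version + 1)


loc = Locator.parse("prn:acme:dataset:clicks:3")
newer = loc.next_version()
```

Keeping the identity fields fixed and changing only the version is one simple way to follow an item through its lifecycle; the real scheme may differ.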
Model Group and Model Version
Model groups solve the management problems that arise when a single model accumulates multiple versions during self-learning. Model version management makes it possible to locate different model snapshots of the same scenario.
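One way to picture the relationship between a model group and its versions is the minimal sketch below. The `ModelGroup` and `ModelVersion` classes, their fields, and the artifact paths are assumptions for illustration and do not reflect the product's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    number: int      # 1-based snapshot number within the group
    artifact: str    # hypothetical path to the stored model file
    metrics: dict    # evaluation metrics recorded for this snapshot


@dataclass
class ModelGroup:
    scenario: str
    versions: list = field(default_factory=list)

    def register(self, artifact: str, metrics: dict) -> ModelVersion:
        """Append a new snapshot and assign it the next version number."""
        version = ModelVersion(len(self.versions) + 1, artifact, dict(metrics))
        self.versions.append(version)
        return version

    def get(self, number: int) -> ModelVersion:
        """Locate a specific snapshot by its version number."""
        for version in self.versions:
            if version.number == number:
                return version
        raise KeyError(number)

    def latest(self) -> ModelVersion:
        """Return the most recently registered snapshot."""
        if not self.versions:
            raise LookupError("no versions registered")
        return self.versions[-1]


group = ModelGroup(scenario="churn-prediction")
group.register("models/churn/v1.bin", {"auc": 0.81})
group.register("models/churn/v2.bin", {"auc": 0.84})
```

Each retraining round adds a snapshot to the same group, so earlier versions stay addressable for comparison or rollback.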
Model Production Information Summary
Displays the relevant assessment reports generated during model production, which supports an accurate and comprehensive assessment of the model.
Visualized and Interpretable Model
Visual, easy-to-understand model interpretation not only helps modeling engineers analyze and optimize the model, but also gives business personnel an important channel for understanding how the model works, making model application more transparent and controllable.
Compatible with Multiple Model Formats
Supports import and export of open-source and third-party models, along with version management and online deployment, so that enterprises can manage model assets in one place, accumulate business value, and improve management efficiency.
Enterprise-class Data Permission System
Data is isolated at both the tenant level and the user level to protect each enterprise's data space from theft and contamination; unit data is shared within a workspace to enable cross-department data reuse and knowledge transfer.
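The isolation and sharing rules can be sketched as a single access check. The rule set below (tenant boundary first, then ownership, then workspace membership) is an assumed reading of the description, and the `User`, `DataItem`, and `can_read` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    name: str
    tenant: str
    workspaces: frozenset = frozenset()


@dataclass(frozen=True)
class DataItem:
    owner: str
    tenant: str
    workspace: Optional[str] = None  # None means private to the owner


def can_read(user: User, item: DataItem) -> bool:
    """Assumed rules: never cross tenants; owners always read their own
    data; other users read only data shared into a workspace they joined."""
    if user.tenant != item.tenant:
        return False
    if item.owner == user.name:
        return True
    return item.workspace is not None and item.workspace in user.workspaces


alice = User("alice", "acme", frozenset({"risk-team"}))
bob = User("bob", "acme", frozenset({"risk-team"}))
eve = User("eve", "other-corp", frozenset({"risk-team"}))

private_item = DataItem(owner="alice", tenant="acme")
shared_item = DataItem(owner="alice", tenant="acme", workspace="risk-team")
```

Checking the tenant boundary before anything else means a workspace name that happens to match in another tenant never grants access.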
Visualized Task Management Panel
Supports retrieving tasks by scenario, label, name, and other attributes; a centralized identification system locates tasks and data in a unified way.
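Retrieval by scenario, label, and name amounts to combining optional filters. The sketch below shows one simple way to do that in memory; the `Task` fields and the `find_tasks` function are hypothetical, and a real panel would presumably query an indexed store.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    scenario: str
    labels: set = field(default_factory=set)


def find_tasks(tasks, scenario=None, label=None, name_contains=None):
    """Return tasks matching every filter that was supplied."""
    result = []
    for task in tasks:
        if scenario is not None and task.scenario != scenario:
            continue
        if label is not None and label not in task.labels:
            continue
        if name_contains is not None and name_contains not in task.name:
            continue
        result.append(task)
    return result


# Hypothetical sample tasks.
tasks = [
    Task("daily-feature-build", "fraud", {"batch", "prod"}),
    Task("stream-scoring", "fraud", {"realtime", "prod"}),
    Task("daily-report", "marketing", {"batch"}),
]
fraud_batch = find_tasks(tasks, scenario="fraud", label="batch")
daily = find_tasks(tasks, name_contains="daily")
```

Filters left as `None` are ignored, so the same function serves a search on one attribute or on several at once.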
Distributed Data Task Scheduling System
A distributed task management engine schedules data jobs uniformly and provides one-stop management of offline batch processing and real-time stream processing.