Sage Data Platform
AI Data Governance Platform

Product Value

Large Capacity

Connects seamlessly to multi-source heterogeneous data and supports massive distributed data storage.

High Throughput

Supports mainstream distributed computing frameworks such as Hadoop, achieves million-level throughput, and scales horizontally.

Low Threshold

Includes multiple built-in, self-developed visual algorithm tools, along with a convenient tool for feature tracking and solution debugging.

High Availability

Stateless internal microservices and a built-in network load balancer give the system high availability and horizontal scalability.

Enterprise-class

Isolates tenant and user permissions while offering shared workspaces, and provides comprehensive tools for installation, deployment, operation and maintenance.

Product Functions

  • Data Integration
    Unified Access for Multi-Source Heterogeneous Data

    Provides a unified access engine for structured data and for unstructured data such as images and text, connecting seamlessly to mainstream data warehouses and relational databases. It also supports composite data access in AI application scenarios, including tagged image datasets, samples and models.

    Time Series Data Grouping Management

    With sharded storage, data can be sharded on a chosen time field and retrieved by time slice, so the shards a model needs in a machine learning scenario can be located quickly.
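
    The time-based sharding idea above can be illustrated with a minimal in-memory sketch. It is not the platform's implementation: the class name TimeShardedStore, the default field name event_time, and the one-shard-per-day granularity are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import date, datetime


def shard_key(ts: datetime) -> date:
    """Map a timestamp to its shard. Daily granularity is an assumption."""
    return ts.date()


class TimeShardedStore:
    """Hypothetical in-memory store that groups records by a time field."""

    def __init__(self) -> None:
        # shard key (a calendar day) -> records that fall on that day
        self._shards = defaultdict(list)

    def put(self, record: dict, time_field: str = "event_time") -> None:
        # The record's time field decides which shard it lands in.
        self._shards[shard_key(record[time_field])].append(record)

    def shards_between(self, start: date, end: date) -> list:
        # Only shard keys are inspected, so locating the relevant shards
        # does not require scanning the records themselves.
        return sorted(k for k in self._shards if start <= k <= end)

    def read(self, start: date, end: date) -> list:
        # Read only the shards that overlap the requested time slice.
        return [r for k in self.shards_between(start, end) for r in self._shards[k]]
```

    A training job that needs only one week of samples would call read with that week's start and end dates, touching seven daily shards instead of the full dataset.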

  • Data Governance
    Unified Governance of the Whole Domain Data

    Data Platform provides a unified governance framework and meta-information management for heterogeneous data. Homogeneous data is integrated through data groups and isolated through data domains. This supports data management per sub-business and per sub-scenario, and raises the level of data governance in large-scale AI applications along both horizontal and vertical dimensions.

    Data Lifecycle Positioning and Tracking

    The unified locator (prn) establishes a global identity system for enterprise data, making it quick to locate data and track changes across its lifecycle.

  • Model Management
    Model Group and Model Version

    Model groups solve the management problems that arise when a single model accumulates multiple versions during self-learning. Model version management makes it possible to locate different model snapshots for the same scenario.

    Model Production Information Summary

    Displays the assessment reports generated during model production, supporting an accurate and comprehensive assessment of the model.

    Visualized and Interpretable Model

    Visual, easy-to-understand model interpretation helps modeling engineers analyze and optimize the model, and gives business personnel an important channel for understanding how the model works, making its application more transparent and controllable.

    Compatible with Multiple Model Formats

    Supports import and export of open-source and third-party models, along with version management and online deployment, so that enterprises can manage model assets in one place, accumulate business value and improve management efficiency.
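
    The relationship between a model group and its model versions, described under Model Group and Model Version, can be sketched as a small registry. This is an illustrative assumption rather than the platform's API: the names ModelGroup, ModelVersion, register, get and latest are hypothetical, as is the choice of sequential integer version numbers.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    """One immutable snapshot of a model (hypothetical structure)."""
    version: int
    artifact: str                      # e.g. a path or URI to the stored model
    metrics: dict = field(default_factory=dict)


@dataclass
class ModelGroup:
    """All versions produced for one scenario, e.g. by repeated self-learning."""
    scenario: str
    versions: list = field(default_factory=list)

    def register(self, artifact: str, metrics: dict = None) -> ModelVersion:
        # Versions are numbered sequentially starting from 1 (an assumption).
        snapshot = ModelVersion(len(self.versions) + 1, artifact, metrics or {})
        self.versions.append(snapshot)
        return snapshot

    def get(self, version: int) -> ModelVersion:
        # Locate a specific snapshot of this scenario's model.
        if not 1 <= version <= len(self.versions):
            raise KeyError(f"{self.scenario} has no version {version}")
        return self.versions[version - 1]

    def latest(self) -> ModelVersion:
        if not self.versions:
            raise LookupError(f"{self.scenario} has no registered versions")
        return self.versions[-1]
```

    Grouping by scenario keeps each retrained snapshot addressable by a (scenario, version) pair, so an earlier snapshot can still be located after newer versions are registered.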

  • Enterprise-class Features
    Enterprise-class Data Permission System

    Data is isolated at both the tenant level and the user level to protect each enterprise's private data space from theft and contamination; unit data is shared within the workspace to enable cross-department data reuse and knowledge transfer.

    Visualized Task Management Panel

    Supports retrieving tasks by scenario, label, name and more; a centralized identification system locates tasks and data in a unified way.

    Distributed Data Task Scheduling System

    A distributed task management engine schedules data jobs uniformly and provides one-stop management of offline batch processing and real-time stream processing.