DataArts, a data governance production line, makes "everyone an analyst"
2022-07-18 07:24:00 【Huawei Cloud】
What is data governance?
Generally speaking, traditional data governance specifies when, how, and by whom actions are taken, on which data, and what those actions are. It emphasizes "governing" and "managing" while giving little weight to "value creation".

Consulting design and governance implementation account for 80% of the labor input and project revenue. Throughout the data governance process, there are three business pain points:
- Heavy labor input and a high skill threshold: the early stage depends on governance experts for consulting design, and the later stage depends on development experts for governance implementation. Governance experts need to understand methodology, industry experience, and data warehouse design; development experts need to be proficient in SQL, MapReduce (MR), and Spark development;
- Long cycles and a heavyweight process: traditional data governance follows a waterfall workflow. A detailed design comes first and implementation follows the design, so each step depends strongly on the one before it. When users raise new requirements the design has to be redone, which is neither agile nor efficient;
- Shallow data analysis: more than 90% of the data developed serves BI analysis, and deep data mining is lacking.
Therefore, the core demands of enterprises in data governance fall into three areas:
- Reduce labor input: data is increasingly complex and massive, and the automation and trustworthiness that digital transformation requires can only be achieved with AI capabilities;
- Lower the skill threshold: data governance needs to become work that everyone can take part in, with technical tools provided to data users;
- Improve the working mode: business personnel need an interactive, recommendation-driven way to explore data directly, because the waterfall, pipeline-style development mode can no longer meet the need for agility.
Why do enterprises need data governance?
As digital transformation deepens, data as a core asset should drive the business and release value. That requires data that:
- Can get in: large-volume, diverse, real-time data sources of all kinds can be integrated efficiently;
- Can be held: long-term storage of massive data is cost-effective, there is no need for all kinds of schema conversion, and the data is easy to analyze and compute on;
- Can be sorted out: modeling follows industry best practices, relationships between data are clearly visible, meanings are easy to understand, and quality problems are found in time;
- Can be found quickly: the data assets you need can be located quickly and their value analyzed quickly;
- Works well: data value is made explicit, business needs get a quick response, and business improvement is driven by data.
However, achieving these goals involves three major challenges:
- Data governance is difficult: in turning data into assets that support the business, technologies such as traditional databases, data warehouse modeling, and knowledge graphs cannot meet the needs of enterprise business process analysis and decision-making. Massive heterogeneous data is hard to manage and analyze, and even well-governed data is hard to integrate effectively with applications;
- Numerous systems and complex architectures: as the business grows, many systems need to be managed, such as data lakes, data warehouses, and AI;
- High technical threshold: most enterprises lack big data specialists, so R&D efficiency is low and maintenance costs are high.
DataArts makes data governance automated and intelligent
At present, the data governance production line DataArts has been put into practice extensively both inside and outside Huawei. Internally, Huawei has produced more than 10 million high-quality data assets based on DataArts; externally, DataArts has more than 1,000 government and enterprise customers, and tens of millions of data tasks run on the cloud every day.

The Huawei Cloud data governance production line DataArts helps enterprises tackle the challenges of data intelligence, put data to good use, and meet the core demands of enterprise data governance.
As the name suggests, a data production line works like a factory production line: it takes massive, complex, and disordered data, turns it into clean, transparent, high-quality data "energy", and delivers it to the business.
DataArts helps enterprises bring data into the lake, analyze it, and process it in real time; it uses AI capabilities for intelligent data preparation and management; it provides full-link data security management that protects private data and supports compliance audits of data usage; and it helps enterprises accumulate data assets, realize the full value of their data, and achieve business innovation and growth.
In short, DataArts replaces the traditional, manual-labor-intensive way of processing data and improves efficiency; it lowers the technical threshold so that "everyone is an analyst"; and it lets data "speak intelligently" to drive efficient decision-making.

New features of Huawei's data governance production line DataArts
▌ Automatic metadata discovery and table-based storage as data enters the lake
- Metadata for files and messages on data stores such as OBS, HDFS/SFTP, Kafka, and REST is discovered automatically;
- Custom classifiers support automatic schema inference and extraction for semi-structured data such as CSV, JSON, text, Parquet, ORC, and Hudi;
- Metadata such as tables, fields, and partitions is built and its changes are tracked, which makes data easy to search, compute on, and analyze.
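To make the idea of schema inference concrete, here is a minimal sketch in Python. It does not use or reflect the DataArts API; the function names (`infer_type`, `merge_types`, `infer_schema`) and the small set of type names are assumptions chosen for illustration. It scans CSV rows and keeps, for each column, the narrowest type that fits every value seen.

```python
import csv
import io
from datetime import datetime


def infer_type(value: str) -> str:
    """Return the narrowest type name that can represent one raw CSV cell."""
    if value == "":
        return "null"
    for type_name, parser in (("int", int), ("float", float)):
        try:
            parser(value)
            return type_name
        except ValueError:
            pass
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "string"


def merge_types(a: str, b: str) -> str:
    """Combine the types seen so far in a column with a newly seen type."""
    if a == b or b == "null":
        return a
    if a == "null":
        return b
    if {a, b} == {"int", "float"}:
        return "float"
    return "string"  # any other mix falls back to the widest type


def infer_schema(csv_text: str) -> dict[str, str]:
    """Infer a column -> type mapping from CSV text with a header row."""
    schema: dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if column is None:  # extra cells beyond the header are ignored
                continue
            seen = infer_type(value or "")
            schema[column] = merge_types(schema.get(column, "null"), seen)
    return schema


sample = "id,price,day,note\n1,9.5,2022-07-18,ok\n2,10,2022-07-19,\n"
print(infer_schema(sample))
```

A production classifier would also sample large files instead of reading every row and would handle nested formats such as JSON, which this sketch leaves out.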
▌ Intelligently enhanced AutoETL capability improves data preparation efficiency by 20%
- Combined code and no-code modes: stream and batch data processing jobs can be developed in no-code mode, the number of job nodes drops by 20%, and data job development time falls from days to hours or minutes;
- Rich library of data processing operators: more than 10 categories of data processing, such as cleaning, filtering, merging, and Join, with more than 200 operators.
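An operator library of this kind can be pictured as small, composable functions chained into a job. The sketch below is only an illustration under that assumption: the operator names (`clean`, `keep_if`, `join`) and the `run_pipeline` helper are hypothetical and are not DataArts operators.

```python
from typing import Callable, Iterable

Row = dict
Operator = Callable[[Iterable[Row]], Iterable[Row]]


def clean(field: str) -> Operator:
    """Cleaning operator: strip surrounding whitespace from a string field."""
    def op(rows):
        for row in rows:
            yield {**row, field: row[field].strip()}
    return op


def keep_if(predicate: Callable[[Row], bool]) -> Operator:
    """Filter operator: keep only the rows for which predicate is true."""
    def op(rows):
        return (row for row in rows if predicate(row))
    return op


def join(lookup: dict, key: str) -> Operator:
    """Join operator: inner-join each row with a lookup table keyed by `key`."""
    def op(rows):
        for row in rows:
            if row[key] in lookup:
                yield {**row, **lookup[row[key]]}
    return op


def run_pipeline(rows: Iterable[Row], operators: list) -> list:
    """Apply the operators in order and materialize the result."""
    for operator in operators:
        rows = operator(rows)
    return list(rows)


orders = [
    {"city": " Shenzhen ", "amount": 120},
    {"city": "Beijing", "amount": 30},
    {"city": "Paris", "amount": 500},
]
regions = {"Shenzhen": {"region": "South"}, "Beijing": {"region": "North"}}
job = [clean("city"), keep_if(lambda r: r["amount"] >= 100), join(regions, "city")]
print(run_pipeline(orders, job))
```

Because every operator has the same shape (rows in, rows out), a no-code editor could in principle assemble such a list from a visual canvas; that mapping is an assumption here, not a description of the product.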
▌ Intelligently enhanced data anomaly detection improves the efficiency of data quality audits
- Potential duplicate data blocks are found through methods such as fuzzy indexing and pattern mining;
- Syntactic differences in data are checked through similarity comparison, and semantic differences are checked through entity resolution against a domain knowledge base;
- Real-time sampling supports data quality previews, and high-performance scanning supports data quality computation, with scanning speed improved fivefold.
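The similarity comparison mentioned above can be illustrated with Python's standard `difflib` module. This is a rough sketch, not the algorithm DataArts uses: the `normalize` and `find_near_duplicates` names and the 0.85 threshold are assumptions. It compares every pair of records, which is quadratic; an index that narrows the candidate pairs first would be needed at scale.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def find_near_duplicates(records: list[str], threshold: float = 0.85):
    """Return (i, j, score) for record pairs whose similarity >= threshold."""
    normalized = [normalize(record) for record in records]
    pairs = []
    for i, j in combinations(range(len(records)), 2):
        score = SequenceMatcher(None, normalized[i], normalized[j]).ratio()
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs


records = [
    "Huawei Technologies Co., Ltd.",
    "huawei technologies co ltd",
    "Shenzhen Metro Group",
]
for i, j, score in find_near_duplicates(records):
    print(records[i], "<->", records[j], score)
```

Character-level similarity only catches syntactic variants such as punctuation or casing. Recognizing that two differently written names refer to the same entity needs the knowledge-base lookup the article describes, which is outside this sketch.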
▌ Enterprise data catalog: search and manage data assets as you would with a search engine
- Enterprise data catalog: a unified data catalog for logical data lakes across multiple clouds and regions. Technical metadata is synchronized automatically and associated with business metadata and management metadata;
- Intelligent recommendation: natural-language semantic search is supported, along with intelligent search suggestions, asset recommendation, and ranking;
- 360-degree panoramic "entity-relationship" knowledge graph that discovers data connections automatically. Intelligent navigation and advanced graph analysis such as path analysis and community analysis are supported, with response times within 200 ms for graph analysis on 10,000+ nodes.
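Path analysis on an entity-relationship graph boils down to finding how two data assets are connected. The sketch below uses a plain breadth-first search over an adjacency list; the table names and the `shortest_path` function are made up for illustration and say nothing about how DataArts stores or queries its graph.

```python
from collections import deque


def shortest_path(graph: dict[str, list[str]], start: str, goal: str):
    """Breadth-first search; returns the shortest list of nodes, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical directed graph: an edge means "this table references that one".
graph = {
    "orders": ["customers", "products"],
    "customers": ["regions"],
    "products": ["suppliers"],
    "suppliers": ["regions"],
}
print(shortest_path(graph, "orders", "regions"))
```

Breadth-first search visits each node and edge at most once, so its cost grows linearly with graph size.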
▌ Full-link data security protection, centralized security policy governance, and intelligent identification of private data
- Centralized data security management: enterprises get unified control over their data security policies;
- Intelligent data security: a built-in GDPR security rule base, data access control, automatic identification of sensitive data, and intelligent data protection (encryption, masking, watermarking);
- Full-link data security: data security capabilities are built into every stage, including data integration, transmission, storage, data architecture design, development and preparation, asset search, and open services.
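Rule-based identification and masking of sensitive data can be sketched in a few lines. The rule names, the regular expressions, and the masking policy below are all assumptions made for illustration; they are not the built-in DataArts rule base, and real rules would be far more thorough.

```python
import re

# Hypothetical rule set: names and patterns are illustrative only.
RULES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b1[3-9]\d{9}\b"),  # 11-digit mobile number shape
}


def classify(text: str) -> set[str]:
    """Return the names of all rules that match somewhere in the text."""
    return {name for name, pattern in RULES.items() if pattern.search(text)}


def mask(text: str) -> str:
    """Keep the first and last two characters of each match, star the rest."""
    def redact(match: re.Match) -> str:
        value = match.group(0)
        return value[:2] + "*" * (len(value) - 4) + value[-2:]

    for pattern in RULES.values():
        text = pattern.sub(redact, text)
    return text


message = "Contact li@example.com or 13812345678"
print(classify(message))
print(mask(message))
```

Masking keeps a field recognizable while hiding most of its value. Encryption and watermarking, the other two protections the article lists, serve different purposes and are not covered by this sketch.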