Supercharge your data lakehouse with Apache Iceberg in Cloudera Data Platform
2022-07-18 02:28:00 【Big data grocery store】
We are pleased to announce the general availability of Apache Iceberg in Cloudera Data Platform (CDP). Iceberg is a 100% open table format, developed through the Apache Software Foundation, that helps users avoid vendor lock-in. Today's general availability announcement covers Iceberg in key CDP data services, including Cloudera Data Warehouse (CDW), Cloudera Data Engineering (CDE), and Cloudera Machine Learning (CML). These services let analysts and data scientists collaborate on the same data with the tools and analytic engines of their choice. Because Iceberg is part of CDP, companies get its benefits without lock-in, unnecessary data transformations, or data movement across tools and clouds just to extract insight from their data.
As the first hybrid data platform to offer an open data lakehouse, CDP supports petabyte-scale, multi-function analytics on both streaming and stored data in cloud-native object storage, across multiple public clouds and on-premises environments. This leaves our customers free to choose the analytic tools they prefer. In line with Cloudera's vision for hybrid data, enterprises that adopt an open data lakehouse can easily get application interoperability and portability between on-premises environments and any public cloud, without worrying about data scaling. Through the Shared Data Experience (SDX), built into CDP from the start, customers benefit from common metadata, security, and governance models across all of their data.
1. Why integrate Apache Iceberg with Cloudera Data Platform?
At Cloudera, our commitment to openness and interoperability is unequivocal. It has driven us to make many significant contributions to Apache Hive, Apache Spark, Apache NiFi, Apache Impala, Apache YuniKorn, and other community projects. In February 2022, we introduced Apache Iceberg in CDP as a technical preview. Over the past decade, Cloudera enabled multi-function analytics on the data lake by introducing the Hive table format and Hive ACID. The lakehouse model has since moved to the cloud, but it is still driven by table formats tied to a primary engine, usually from a single vendor. Meanwhile, companies continue to need highly scalable and flexible analytic engines and services on the data lake, free of vendor restrictions. Organizations need a modern data architecture that can evolve with the business, and we are happy to support them with the first open data lakehouse. Apache Iceberg, now included as part of CDP, brings significant benefits to a modern data architecture, including:
- In-place table evolution: schema and partition changes run as a single command instead of a week-long process
- Time travel: point-in-time queries provide forensic visibility and support compliance
- Concurrent multi-function analytics: meets end-to-end data lifecycle requirements from the edge to AI
- Performance: active partitioning improves performance on very large datasets
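
To make the first two benefits more concrete, here is a minimal Python sketch that only builds the kind of SQL statements Iceberg accepts through Spark. The table name `sales_db.orders`, the column names, and the helper functions are hypothetical and do not come from the original article. The sketch assumes a Spark 3.3+ session with the Iceberg SQL extensions enabled; running the statements is not shown here, and the exact syntax differs in Hive and Impala.

```python
# Hypothetical table name, used for illustration only.
TABLE = "sales_db.orders"


def add_column(table: str, name: str, sql_type: str) -> str:
    """Schema evolution: Iceberg records the new column in table metadata."""
    return f"ALTER TABLE {table} ADD COLUMNS ({name} {sql_type})"


def add_partition_field(table: str, transform: str) -> str:
    """Partition evolution: requires the Iceberg Spark SQL extensions."""
    return f"ALTER TABLE {table} ADD PARTITION FIELD {transform}"


def time_travel(table: str, timestamp: str) -> str:
    """Point-in-time query using the Spark 3.3+ TIMESTAMP AS OF clause."""
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


if __name__ == "__main__":
    statements = [
        add_column(TABLE, "discount", "double"),
        add_partition_field(TABLE, "days(order_ts)"),
        time_travel(TABLE, "2022-07-01 00:00:00"),
    ]
    for statement in statements:
        # Each string could be passed to spark.sql(...) in a configured session.
        print(statement)
```

Each statement is a single command, which is the point the list above makes: changing the schema or partition layout does not require rewriting the existing data files.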

2. CDP provides the fastest and simplest way to adopt Iceberg
We integrated Iceberg directly into the SDX layer of CDP, so customers can start using Iceberg easily and immediately get the productivity and performance benefits of an open table format. Customers can perform a metadata-only migration with a single command, without touching any of the underlying large datasets. This is a major accelerator for adoption.
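
As a rough illustration of what a single-command, metadata-only migration can look like, the sketch below builds two statements: the `migrate` stored procedure that Apache Iceberg exposes to Spark, and the Hive `ALTER TABLE` form that sets the Iceberg storage handler. The article does not say which command CDP uses, so treat both as assumptions and check them against the documentation for your CDP version. The catalog and table names are hypothetical.

```python
# Hypothetical names, used for illustration only.
CATALOG = "spark_catalog"
TABLE = "sales_db.orders"

# Storage handler class shipped with Iceberg's Hive integration.
HIVE_ICEBERG_HANDLER = "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler"


def spark_migrate(catalog: str, table: str) -> str:
    """Iceberg's Spark procedure that converts a table in place."""
    return f"CALL {catalog}.system.migrate('{table}')"


def hive_migrate(table: str) -> str:
    """Hive statement that switches an existing table to the Iceberg handler."""
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('storage_handler'='{HIVE_ICEBERG_HANDLER}')"
    )


if __name__ == "__main__":
    print(spark_migrate(CATALOG, TABLE))
    print(hive_migrate(TABLE))
```

In both cases the existing data files stay where they are and only table metadata is created or updated, which is why the migration does not need to touch the underlying datasets.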
3. Supercharge your data lakehouse and make it open
The data lakehouse is nothing new to Cloudera or our customers. For example, IQVIA used Cloudera to consolidate more than 2 PB of data from 250 data warehouses around the world (including Oracle, IBM Netezza, and Teradata systems) into a global multi-tenant data lake, and runs its analytics on that data lake. IQVIA has used the Hive open table format and Cloudera's pre-integrated multi-function analytics platform for more than five years. But the current data lakehouse architecture model is not enough. We see that companies need a platform that spans the entire data lifecycle and supports multiple advanced analytics use cases, including full data-in-motion and operational database offerings. That is the open data lakehouse, and only Cloudera can provide it on a hybrid data platform.

With Apache Iceberg in CDP, Cloudera leads the data lakehouse with open data and a community ecosystem, along with enterprise hardening and performance. Our technical preview customers shared the following feedback:
Teranet: "After evaluating all the major open source storage frameworks for building our lakehouse, we chose Apache Iceberg because it is 100% open, rich in features, and backed by strong community participation. Now, with Iceberg, CDP supports an open data lakehouse architecture that gives us a future-proof data platform for all of our analytic workloads. We chose change data capture as our first use case on Iceberg. By updating our data lake frequently, we aim to accelerate reporting and business intelligence and give our business teams access to current insights. Partition evolution is also a key capability for us, delivering superior query performance for large-scale data engineering and BI workloads," said Steve Brackenbury, System Architect at Teranet.
Modak Nabu: "Our collaboration with Cloudera enables us to help our customers deploy a unified data lakehouse framework for any analytics use case (artificial intelligence, machine learning, SQL, business intelligence reports, dashboards, and more). With Modak Nabu certified against the Iceberg table format in Cloudera's CDP, enterprise customers can accelerate the ingestion, management, and consumption of data at petabyte scale, which simplifies data management and speeds up data access," said Daniel Mantovani, Director of Innovation at Modak Analytics.
A customer took full advantage of the partition evolution feature in CDP and achieved more than a 10x improvement in query performance by using finer-grained partitions on its data, without regenerating or modifying any of the underlying data.
Our integration of Apache Iceberg into CDP goes beyond the data lakehouse. We can process any data anywhere, including hybrid cloud and multi-cloud environments. We work natively wherever your data is born, lands, and is used.
Original authors: Bill Zhang, Shaun Ahmadian, and Cloudera Contributors
Link to the original text: https://blog.cloudera.com/supercharge-your-data-lakehouse-with-apache-iceberg-in-cloudera-data-platform/