Intel Expert Talk: How to Program Efficiently on the XPU Architecture? Xeon Research Institute
2022-07-19 04:11:00 【Intel edge computing community】

From audio, video, and image processing to AI deep learning inference and training, the volume of data being processed is growing exponentially across many application scenarios, data formats are becoming more diverse, and the demands placed on compute chips are becoming more varied and complex.
To meet this challenge, Intel proposed the XPU concept as early as 2018: use a mix of computing architectures to cover the full range of complex workloads. Specifically, this is the SVMS architecture, made up of scalar (Scalar), vector (Vector), matrix (Matrix), and spatial (Spatial) compute, which correspond to CPUs, GPUs, accelerators, and FPGAs respectively. These heterogeneous processors can be combined to process many kinds of workloads at high performance.
With so many chips of different architectures, how should the software be coordinated? How can developers efficiently build applications that span multiple computing architectures?
At the June 29 “Xeon Research Institute” thought-sharing session, Liu Yun, a high-performance computing solutions architect at Intel, addressed this hot topic with an in-depth look at the XPU concept and its programming models.
● “X” Stands for Any Acceleration Hardware, ●
Four Mainstream Programming Technologies
Opening the talk, Liu Yun explained that XPU is short for Cross-Architecture Processing Units. The “X” can be read as a wildcard that stands for the whole range of acceleration and processing chips serving different workloads and power budgets. Of course, acceleration hardware with different architectures also brings more complexity: it usually calls for different development technologies, and it makes it harder for developers to guarantee data correctness.

For heterogeneous programming, the current mainstream programming technologies include:
OpenMP: a parallel computing API driven by #pragma directives. Mainstream C, C++, and Fortran compilers support it. It has supported heterogeneous devices since version 4.0, but the parallel code is not very flexible and the syntax keeps getting more complex.
OpenCL: a single source code base can target heterogeneous hardware, but OpenCL is harder to learn and has a steeper learning curve.
CUDA, HIP, etc.: run only on specific XPUs.
SYCL and the oneAPI ecosystem: based on standard C++ template programming, with broad library support (oneAPI). The language and programming model stay the same regardless of the target device.
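To make the directive-based style of the OpenMP entry above concrete, here is a minimal sketch of heterogeneous offload with the `target` construct introduced in OpenMP 4.0. The function name `saxpy` and its parameters are hypothetical, chosen only for illustration. Whether the loop actually runs on an accelerator depends on the compiler and its offload flags; a compiler without OpenMP support simply ignores the pragma and runs the loop serially on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the talk): computes y[i] = alpha * x[i] + y[i].
void saxpy(float alpha, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const float* xp = x.data();
    float* yp = y.data();

    // Ask the OpenMP runtime to run the loop on a target device if one is
    // available. map(to:) copies x to the device; map(tofrom:) copies y to
    // the device and back. Without an offload device the loop runs on the host.
    #pragma omp target teams distribute parallel for map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = alpha * xp[i] + yp[i];
    }
}
```

The loop body is ordinary C++; everything specific to the device lives in the single directive, which is both the appeal of this style and the reason its clause syntax grows as more hardware features have to be expressed.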
● SYCL, a Cross-Platform Abstraction, ●
A New Generation of Heterogeneous Programming
Intel's oneAPI is an open, standards-based, unified software stack that spans architectures and vendors. Its defining feature is that it abstracts away the differences between hardware of different brands and types and offers a single-source programming solution, making it easier for developers to build applications across a variety of computing architectures.
oneAPI supports higher-level libraries and frameworks such as PyTorch, TensorFlow, and NumPy; covers languages including Python, SYCL, Fortran, and C++; gives developers tools for performance monitoring, tuning, and debugging; and ships foundational libraries for media processing, image rendering, mathematical computation, deep neural networks, parallel processing, and network communication. In short, its advantages are:
First, it makes fuller and more efficient use of hardware performance;
Second, it gives developers more freedom in how they use the programming model.
As the new generation of heterogeneous platform programming, SYCL is the cross-platform abstraction layer in oneAPI. It lets developers write data-parallel code in standard ISO C++ and provides a unified programming language and application programming interface (API) across CPUs, GPUs, FPGAs, and AI accelerators. Developers learn it once and can then program for different accelerators. Regardless of the target device, the language and programming model are consistent. The final SYCL 2020 specification is significant for the industry because C++ developers can at last use one efficient, unified cross-XPU programming model to build high-performance heterogeneous applications.
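As a rough illustration of the single-source model described above, the sketch below adds two vectors using SYCL 2020 buffers and accessors. The function name `vector_add` is hypothetical. The code assumes the compiler predefines `SYCL_LANGUAGE_VERSION` when building in SYCL mode (for example with a `-fsycl` style flag); when the macro is absent, the sketch falls back to a plain host loop so that it still builds with an ordinary C++ compiler. That fallback is a convenience of this example, not part of SYCL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifdef SYCL_LANGUAGE_VERSION
#include <sycl/sycl.hpp>
#endif

// Hypothetical helper (not from the talk): element-wise sum of two
// non-empty, equal-length vectors.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(!a.empty() && a.size() == b.size());
    std::vector<float> out(a.size());
#ifdef SYCL_LANGUAGE_VERSION
    sycl::queue q;  // default selection: the runtime picks an available device
    {
        // Buffers wrap host memory; the runtime moves data as needed.
        sycl::buffer<float, 1> buf_a(a.data(), sycl::range<1>(a.size()));
        sycl::buffer<float, 1> buf_b(b.data(), sycl::range<1>(b.size()));
        sycl::buffer<float, 1> buf_out(out.data(), sycl::range<1>(out.size()));

        q.submit([&](sycl::handler& h) {
            sycl::accessor in_a(buf_a, h, sycl::read_only);
            sycl::accessor in_b(buf_b, h, sycl::read_only);
            sycl::accessor res(buf_out, h, sycl::write_only, sycl::no_init);
            // The same kernel lambda is compiled for whichever device runs it.
            h.parallel_for(sycl::range<1>(a.size()), [=](sycl::id<1> i) {
                res[i] = in_a[i] + in_b[i];
            });
        });
    }  // buffers go out of scope here: results are written back to `out`
#else
    // Host fallback used only when this file is not compiled as SYCL.
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] + b[i];
    }
#endif
    return out;
}
```

Calling `vector_add({1.0f, 2.0f}, {3.0f, 4.0f})` yields a vector holding 4 and 6. The host code and the kernel live in one C++ source file, and nothing in the kernel names a particular device, which is the sense in which the language and programming model stay the same across targets.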
Intel has supported the open source ecosystem and community for many years and follows the open SYCL standard. It provides SYCL-based math and development libraries such as oneMKL and oneDPL, and these open source libraries also run on Intel GPUs.
● Open Source, Open Model, ●
Empowering the Digital Economy with Technology
It is worth noting that oneAPI and SYCL are already being used in high-performance computing. The “Aurora” supercomputer at Argonne National Laboratory in the United States will use next-generation Intel Xeon Scalable processors (code-named “Sapphire Rapids”) and the next-generation Intel data center GPU for HPC and AI (code-named “Ponte Vecchio”), and is expected to deliver peak double-precision performance of more than 2 exaFLOPS (two quintillion operations per second). To avoid being locked into a single-architecture, single-vendor programming model, “Aurora” uses oneAPI to support high-performance computing, artificial intelligence / machine learning, and big data analytics workloads. This reduces the need to maintain separate code bases and multiple programming languages, enables portable and efficient scientific computing, and meets the needs of different tools and workflows.
Beyond these two anticipated XPU products, Intel disclosed the next step in its XPU roadmap at its annual Investor Meeting on February 17 this year. One of the most eye-catching ideas is that “a single chip can itself be an XPU.” To that end, Intel plans to launch the new Falcon Shores processor architecture in 2024, built on next-generation packaging, memory, and I/O technology. It integrates Intel x86 CPU and Xe GPU hardware into the same chip and is expected to deliver more than 5x the performance per watt and 5x the memory capacity of current platforms, bringing significant performance and efficiency gains for processing large data sets and training very large AI models.

Liu Yun closed with some exciting news: the newly released TensorFlow 2.9 already has oneDNN from oneAPI built in, so it can take full advantage of the built-in vectorization acceleration of Intel Xeon Scalable processors. Without modifying application code, users can significantly improve the performance of AI applications, deep neural networks in particular, on Intel Xeon Scalable platforms.
In short, all of Intel's efforts aim to turn technical capability into industry practice: providing a general, open programming model based on high industry standards that unlocks the underlying hardware performance while reducing software development and maintenance costs.
The “Xeon Research Institute” events likewise aim to empower developers with concrete technical capabilities and practical experience from various industries, and so promote innovation in the digital economy era. To learn more about the technology and experience shared, follow the “Xeon Research Institute” thought-sharing sessions.
“Xeon Research Institute” Thought-Sharing Sessions
An activity of the Intel High Performance Computing elite group community: each month, Intel experts and partner experts are invited to hold small online sharing sessions on hot topics and interact with group members.
In June, we invited Liu Yun, an Intel high-performance computing solutions architect, to present “A First Look at XPU Programming Models”.
