[CANN Training Camp] Introduction to Ascend AI Basics
2022-07-19 13:37:00 [Huawei Cloud]
1. Ascend AI full-stack architecture

1.1 The four major parts of the Ascend AI full stack
Application enablement: this layer typically contains the software and hardware used to deploy models, such as APIs, SDKs, deployment platforms, and model libraries.
AI framework layer: this layer contains the training frameworks used to build models, such as Huawei's MindSpore, TensorFlow, and PyTorch.
Heterogeneous computing architecture: a lower-level, general-purpose computing framework that accelerates calls from the AI frameworks above it. It aims to support a variety of AI frameworks and to accelerate them on the hardware.
Computing hardware: this layer is the foundation of AI computing. Powerful chips and hardware devices give the acceleration in the upper layers something to run on.
2. Heterogeneous computing architecture CANN
2.1 CANN abstract five-layer architecture
Targeting fields such as computer vision, natural language processing, recommendation systems, and robotics, Huawei built the Ascend AI processors on its DaVinci architecture, starting its journey into intelligent computing. To improve development efficiency and unlock the computing power of Ascend AI processors, Huawei simultaneously launched CANN (Compute Architecture for Neural Networks), a heterogeneous computing architecture for AI scenarios. By providing multi-level programming interfaces, CANN covers all scenarios, has a low barrier to entry, and delivers high performance, helping users quickly build AI applications and services on the Ascend platform.
The Ascend AI heterogeneous computing architecture (Compute Architecture for Neural Networks, CANN) is abstracted into five layers, as shown in the figure below.

1. Ascend Computing Language interface
The Ascend Computing Language (AscendCL) interface is the open programming framework for Ascend computing and wraps the low-level Ascend computing service interfaces. It provides API libraries for Device management, Context management, Stream management, memory management, model loading and execution, operator loading and execution, media data processing, and Graph management, which users call to develop AI applications.
2. Ascend computing service layer
This layer mainly provides the Ascend computing libraries, such as the neural network (NN) library and the linear algebra library (Basic Linear Algebra Subprograms, BLAS), and the Ascend tuning engine libraries, such as operator tuning, subgraph tuning, gradient tuning, model compression, and the AI framework adapter.
3. Ascend computing compilation engine
This layer mainly provides the Graph Compiler and TBE (Tensor Boost Engine) operator development support. The former compiles the computation graph that the user supplies in intermediate representation (IR) into a model that can run on the NPU. The latter provides the tools users need to develop custom operators.
4. Ascend computing execution engine
This layer is responsible for executing models and operators. It provides the Runtime library (memory allocation, model management, data transfer, and so on), the Graph Executor, Digital Vision Pre-Processing (DVPP), Artificial Intelligence Pre-Processing (AIPP), the Huawei Collective Communication Library (HCCL), and other functional units.
5. Ascend computing base layer
This layer mainly provides basic services to the layers above it, such as Shared Virtual Memory (SVM), device virtualization (Virtual Machine, VM), and Host Device Communication (HDC).
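One small part of what a graph compiler has to do is decide an execution order in which every operator runs after the operators it depends on. The following is a minimal sketch of that ordering step only, using a plain topological sort. The graph format, the node names, and the `compile_graph` function are assumptions made for illustration and are not part of CANN.

```python
from collections import deque


def compile_graph(ir_graph):
    """Order the nodes of a tiny IR graph so every node runs after its inputs.

    ir_graph maps a node name to the list of node names it consumes.
    Returns a list of node names (Kahn's topological sort).
    """
    # Count how many unresolved inputs each node still has.
    pending = {node: len(inputs) for node, inputs in ir_graph.items()}
    # Reverse edges: for each node, which nodes consume its output.
    consumers = {node: [] for node in ir_graph}
    for node, inputs in ir_graph.items():
        for src in inputs:
            consumers[src].append(node)

    # Nodes with no inputs can run immediately.
    ready = deque(sorted(n for n, count in pending.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in consumers[node]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(ir_graph):
        raise ValueError("graph contains a cycle")
    return order


# Hypothetical four-node network: input -> conv -> relu -> output.
toy_graph = {
    "input": [],
    "conv": ["input"],
    "relu": ["conv"],
    "output": ["relu"],
}
schedule = compile_graph(toy_graph)
print(schedule)
```

The sketch covers ordering only and does not attempt to model anything else the compilation engine does.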
2.2 CANN three-layer logical architecture

1. Application layer
Includes the applications developed on the Ascend platform, together with the tools Ascend provides for algorithm development and tuning.
   1. Inference applications
   Inference applications built on the APIs provided by AscendCL.
   2. AI frameworks
   Includes TensorFlow, Caffe, MindSpore, and third-party frameworks.
   3. Model miniaturization tool
   Quantizes the model to accelerate it.
   4. AutoML tool
   An automated learning tool based on MindSpore. It searches, according to the characteristics of Ascend chips, for a network with good hardware affinity, making full use of Ascend performance.
   5. Acceleration library
   Acceleration libraries built on AscendCL (currently a BLAS acceleration library is supported).
   6. MindStudio
   An integrated development environment and debugging tool for developers. MindStudio supports offline model conversion, development and debugging of offline inference applications, algorithm debugging, custom operator development and debugging, log viewing, performance tuning, and system fault diagnosis.
2. Chip enablement layer
Opens the solution's capabilities to external users, and controls and runs the service flow based on the computation graph.
   1. AscendCL (Ascend Computing Language) library
   An open programming framework that provides API libraries for Device/Context/Stream/memory management, model and operator loading and execution, media data processing, and Graph management, which users call to develop deep neural network applications.
   2. Graph optimization and compilation
   A unified IR interface connects to different front ends. It supports parsing, optimizing, and compiling computation graphs expressed in TensorFlow, Caffe, or MindSpore, and provides optimized deployment for the back-end compute engines.
   - Graph Engine: the control center for graph compilation and execution
   - Fusion Engine: manages operator fusion rules
   - AICPU Engine: manages AI CPU operator information
   - HCCL: manages HCCL operator information
   3. Operator compilation and operator library
   - TBE: compiles and generates operators, and provides operator development tools
   - Operator library: neural network acceleration library
   4. Digital vision pre-processing
   Provides video encoding and decoding (VENC/VDEC), JPEG encoding and decoding (JPEGD/JPEGE), PNG decoding (PNGD), and VPC (pre-processing).
   5. Execution engine
   - Runtime: provides resource management channels for dispatching neural network tasks
   - Task Scheduler: manages, schedules, and executes the task sequence of the computation graph
3. Computing resource layer
Mainly performs the system's data processing and computation.
   1. Computing devices
   - AI Core: executes NN operators
   - AI CPU: executes CPU operators
   - DVPP: video and image encoding/decoding and pre-processing
   2. Communication links
   - PCIe: high-speed interconnect between chips, or between a chip and the CPU
   - HCCS: provides cache coherence between chips
   - RoCE: provides RDMA access to chip memory
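The division of labor among the computing devices listed above (AI Core for NN operators, AI CPU for CPU operators, DVPP for media work) can be pictured as a simple routing table. The sketch below is purely illustrative: the operator names and the routing rule are assumptions for this example, not CANN's actual dispatch logic.

```python
# Illustrative routing only; the operator names and this table are
# assumptions made for the sketch, not CANN's real dispatch rules.
NN_OPS = {"conv2d", "matmul", "relu"}
MEDIA_OPS = {"jpeg_decode", "resize", "video_decode"}


def pick_compute_unit(op_name):
    """Return the compute unit that handles op_name in this toy model."""
    if op_name in NN_OPS:
        return "AI Core"
    if op_name in MEDIA_OPS:
        return "DVPP"
    # Everything else falls back to the general-purpose AI CPU.
    return "AI CPU"


# Hypothetical image-classification pipeline.
pipeline = ["jpeg_decode", "resize", "conv2d", "relu", "topk"]
placement = [(op, pick_compute_unit(op)) for op in pipeline]
for op, unit in placement:
    print(f"{op:12s} -> {unit}")
```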
3. Ascend Computing Language interface (AscendCL)
3.1 Introduction to AscendCL
AscendCL (Ascend Computing Language) is the open programming framework for Ascend computing and wraps the underlying Ascend computing service interfaces. It provides API libraries for runtime resource management (devices, memory, and so on), model loading and execution, operator loading and execution, and image encoding, decoding, cropping, and resizing. Together these give applications on the Ascend CANN platform deep learning inference, image pre-processing, and single-operator accelerated computation. Put simply, it is a unified API framework through which all resources are called.
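The typical shape of a program built on this kind of API is: acquire runtime resources, load a model, execute it, then release everything in reverse order. The sketch below imitates that lifecycle with stand-in Python code. It deliberately does not call the real AscendCL library. The `resource` helper, the `run_inference` function, and the file name `resnet50.om` are all hypothetical, and the real call names and signatures should be taken from the official AscendCL API reference.

```python
from contextlib import contextmanager

calls = []  # records the order of acquire/release steps


@contextmanager
def resource(name):
    """Acquire a named stand-in resource, then release it on exit."""
    calls.append(f"acquire {name}")
    try:
        yield name
    finally:
        calls.append(f"release {name}")


def run_inference(model_path, data):
    """Toy inference flow: set up resources, 'run' the model, tear down."""
    # Resources are released in the reverse order of acquisition.
    with resource("device"), resource("context"), resource("stream"):
        with resource(f"model:{model_path}"):
            calls.append("execute")
            # Stand-in for real inference: just double each input value.
            return [2 * x for x in data]


result = run_inference("resnet50.om", [1, 2, 3])
print(result)
for step in calls:
    print(step)
```

Using context managers here is a design choice of the sketch: it guarantees the teardown steps run even if execution fails, which is the property an application needs regardless of the concrete API.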

3.2 Advantages of AscendCL
1. High abstraction: the APIs for operator compilation, loading, and execution are unified. Compared with one API per operator, AscendCL greatly reduces the number of APIs and therefore the complexity.
2. Backward compatibility: AscendCL is backward compatible, so programs compiled against an older version still run on a newer version after a software upgrade.
3. Chip-agnostic: one set of AscendCL interfaces keeps application code uniform across different Ascend processors, with no differences for the application to handle.
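The "high abstraction" point is easiest to see by contrast: instead of exposing one function per operator, a framework can expose a single execution entry point that takes the operator type as data. A minimal sketch of that idea follows. The `KERNELS` registry, the `execute_op` function, and the operator names are hypothetical and are not AscendCL interfaces.

```python
import operator

# Registry of operator kernels keyed by operator type (all names hypothetical).
KERNELS = {
    "Add": lambda a, b: list(map(operator.add, a, b)),
    "Mul": lambda a, b: list(map(operator.mul, a, b)),
    "Relu": lambda a: [max(0, x) for x in a],
}


def execute_op(op_type, *inputs):
    """Single entry point: look up the kernel by type and run it."""
    try:
        kernel = KERNELS[op_type]
    except KeyError:
        raise ValueError(f"unsupported operator: {op_type}") from None
    return kernel(*inputs)


print(execute_op("Add", [1, 2], [3, 4]))
print(execute_op("Relu", [-1, 0, 5]))
```

Adding a new operator in this sketch means adding one registry entry. The public API surface stays at one function.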
3.3 Main application scenarios of AscendCL
1. Application development: users can directly call the interfaces provided by AscendCL to develop applications such as image classification and object recognition.
2. Called by third-party frameworks: users can call AscendCL interfaces through a third-party framework to use the computing power of Ascend AI processors.
3. Third-party library development: users can also wrap AscendCL in a third-party lib to expose runtime management, resource management, and other capabilities of Ascend AI processors.
3.4 Layered open capabilities of AscendCL
AscendCL manages and controls the capabilities that are opened up layer by layer, with different components connecting to different enablement parts. These include the GE open capability, the operator open capability, the Runtime open capability, and the Driver open capability.
- Model loading open capability: handles loading of om models; the interface is exposed through AscendCL.
- Operator open capability: operator functionality is implemented in CANN, but it is exposed through AscendCL.
- Runtime open capability: handles requests to open up stream-based device capabilities and resources such as memory and events, shielding the application from the underlying implementation.
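To make the stream idea concrete: a stream can be thought of as a queue of tasks that run in submission order, with the application seeing only "submit" and "synchronize" while the runtime hides how tasks reach the device. The class below is a toy model of that contract under this assumption. The `Stream` class and its methods are invented for illustration and do not mirror the real Runtime interfaces.

```python
from collections import deque


class Stream:
    """Toy stream: tasks are queued and later executed in submission order."""

    def __init__(self):
        self._tasks = deque()
        self.log = []  # names of tasks in the order they actually ran

    def submit(self, name, fn, *args):
        # Queue the task; nothing runs until synchronize() is called.
        self._tasks.append((name, fn, args))

    def synchronize(self):
        # Drain the queue in FIFO order and collect the results.
        results = []
        while self._tasks:
            name, fn, args = self._tasks.popleft()
            self.log.append(name)
            results.append(fn(*args))
        return results


stream = Stream()
stream.submit("copy_in", list, (1, 2, 3))   # stand-in for a host-to-device copy
stream.submit("compute", sum, [1, 2, 3])    # stand-in for a kernel launch
stream.submit("copy_out", str, 6)           # stand-in for a device-to-host copy
outputs = stream.synchronize()
print(stream.log)
print(outputs)
```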