OneFlow source code walkthrough: compiling and debugging with GDB
2022-07-18 16:28:00 【ONEFLOW deep learning framework】

Authors | Wang Yi, Yan Hao
Translators | Chenghaoyuan, Dongwenwen
1
GDB Python3
PyTorch has officially published a guide on using GDB to debug the C++ code triggered from Python. For details, see:
https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md#gdb-integration
The core idea is to run gdb python3. In the GDB session, you can set a breakpoint on a C++ function by name, such as at::Tensor::neg. GDB cannot find the function at this point, so it asks whether to make the breakpoint pending until a shared library is loaded; answer yes. Then type run, and GDB starts the Python interpreter. The interpreter prompts for Python source code. Type import torch and press Enter.
When the Python interpreter executes the import statement, the relevant shared libraries are loaded. GDB watches the loading and resolves the pending breakpoint. Run Python code that triggers the breakpoint, and GDB returns to its prompt for C++ debugging: for example, use bt to inspect the backtrace and l to list the C++ code called from Python.
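The interactive session described above can also be driven non-interactively. The following is a minimal sketch, not part of the PyTorch guide: the helper name gdb_batch_command is hypothetical, and it only assembles a command line from standard GDB options (-batch, -ex, --args) plus the GDB setting "set breakpoint pending on", which accepts the pending breakpoint without asking.

```python
import shlex


def gdb_batch_command(breakpoint, python_code, python_exe="python3"):
    """Build an argv list that runs Python under GDB in batch mode.

    gdb_batch_command is a hypothetical helper for illustration only.
    """
    return [
        "gdb", "-batch",
        # Accept pending breakpoints automatically (no y/n question).
        "-ex", "set breakpoint pending on",
        "-ex", f"break {breakpoint}",
        # Start the interpreter; GDB stops when the breakpoint is hit.
        "-ex", "run",
        # Print the backtrace at the stop, then batch mode exits.
        "-ex", "bt",
        "--args", python_exe, "-c", python_code,
    ]


cmd = gdb_batch_command("at::Tensor::neg", "import torch")
# Print a copy-pasteable shell command instead of executing it here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The Python code passed with -c must actually reach the C++ function for the breakpoint to fire; a bare import, as in this example, only loads the shared libraries.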
2
Compile OneFlow in debug mode
Linux System
OneFlow supports Linux; macOS and Windows are not supported for now. This article uses Amazon Linux 2 (similar to CentOS) running on an AWS GPU host.
(base) [wkyi ~]$ cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
Conda or Docker environment
The official OneFlow documentation recommends building in a Conda environment or a Docker image:
https://github.com/Oneflow-Inc/oneflow#option-1-build-with-conda-recommended
This walkthrough uses Anaconda. Conda or Docker pins the versions of the C++ compiler and the rest of the build toolchain. Building with a newer g++ requires changes to the source code, as in https://github.com/Oneflow-Inc/oneflow/issues/8397.
Build the debug version
Note that you must build the debug version of OneFlow, because GDB needs debugging symbols for the output of bt and l to be meaningful.
cd ~/w/oneflow/build
CMAKE_BUILD_TYPE=Debug cmake .. -C ../cmake/caches/international/cpu.cmake
I built the CPU version of OneFlow, so I used the cpu.cmake file. Because my AWS host is outside China, I used the file in the international directory.
Reporting errors
When I ran into installation errors, I filed the related issues on GitHub
(https://github.com/Oneflow-Inc/oneflow/issues?q=is%3Aissue+author%3Awangkuiyi). The OneFlow developers responded quickly. Hats off to them!
Compilation steps
This section shows the specific steps for compiling OneFlow:
1. Download and install Anaconda. The default installation path is ~/anaconda3. Add the environment variables to ~/.bashrc, then source the file or reconnect to the host for the changes to take effect.
2. Create and activate the Conda environment. See: https://github.com/Oneflow-Inc/conda-env
3. Clone the source code with Git
mkdir ~/w
cd ~/w
git clone https://github.com/Oneflow-Inc/oneflow
4. Compile OneFlow
cd oneflow
mkdir build
cd build
CMAKE_BUILD_TYPE=Debug cmake .. -C ../cmake/caches/international/cpu.cmake
make -k -j $(nproc)
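Once the build finishes, it can be worth confirming that the resulting shared library really carries debugging symbols before starting GDB. The sketch below is one possible check, assuming the binutils tool readelf is installed; both function names are hypothetical, and the path of the library to inspect is left to the caller because this article does not name it.

```python
import shutil
import subprocess


def sections_have_debug_info(readelf_output):
    """Return True if a `readelf -S` listing contains a .debug_info section."""
    return any(".debug_info" in line for line in readelf_output.splitlines())


def library_has_debug_info(path):
    """Run `readelf -S` on an ELF file and look for DWARF debug info.

    Assumption: debug info is embedded in the file itself. A library whose
    symbols live in a separate debuginfo file would be reported as False.
    """
    if shutil.which("readelf") is None:
        raise RuntimeError("readelf not found; install binutils first")
    result = subprocess.run(
        ["readelf", "-S", "--wide", path],
        capture_output=True, text=True, check=True,
    )
    return sections_have_debug_info(result.stdout)


# Usage (the path is for you to fill in with a library from your build):
# print(library_has_debug_info("/path/to/some/library.so"))
```

If the check returns False for a library built as above, the build type was probably not picked up, and bt and l will show little more than raw addresses.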
Running and debugging
After the build finishes, ~/w/oneflow/build contains a source.sh file that sets the PYTHONPATH environment variable. Run the following command to apply the setting.
source source.sh
Then run the Python interpreter under GDB.
gdb python3
At the GDB prompt, I set a breakpoint on oneflow::one::Tensor::is_eager. The breakpoint stays pending until the shared library is loaded.
(gdb) b oneflow::one::Tensor::is_eager
Function "oneflow::one::Tensor::is_eager" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (oneflow::one::Tensor::is_eager) pending.
Then type run to start the Python interpreter. At the Python prompt, import oneflow.
(gdb) run
Starting program: /home/wkyi/anaconda3/envs/oneflow-dev-gcc7-v2/bin/python3
Missing separate debuginfos, use: debuginfo-install glibc-2.26-58.amzn2.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Python 3.7.10 (default, Feb 26 2021, 18:47:35)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import oneflow
The import takes longer than usual. If it raises ImportError, check whether you ran source source.sh.
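If the import fails, a quick way to tell a missing PYTHONPATH from a genuinely broken build is to ask Python whether it can locate the package at all, without importing it. The following is a small sketch using only the standard library; diagnose_import is a hypothetical helper name.

```python
import importlib.util
import os


def diagnose_import(module_name):
    """Report where a top-level module would be loaded from, without importing it."""
    spec = importlib.util.find_spec(module_name)
    if spec is not None:
        return f"{module_name} found at {spec.origin}"
    # Not on the search path: show PYTHONPATH so a missing
    # `source source.sh` is easy to spot.
    pythonpath = os.environ.get("PYTHONPATH", "<unset>")
    return f"{module_name} not found; PYTHONPATH={pythonpath}"


print(diagnose_import("oneflow"))
```

A "not found" result with PYTHONPATH unset suggests that source.sh was not sourced in the shell that launched gdb python3.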
Next, create a tensor.
>>> a = oneflow.tensor(1)
Thread 1 "python3" hit Breakpoint 1, oneflow::one::CopyBetweenMirroredTensorAndNumpy<long> (t=..., array=array@entry=0x7fffe5905150, Copy=<optimized out>,
Copy@entry=0x7fffefc977e0 <oneflow::BlobNumpyCopyUtil<long>::From(unsigned long, oneflow::NumPyArrayPtr const&)>, modifier=...,
block_host_until_done=block_host_until_done@entry=false) at /home/wkyi/w/oneflow/oneflow/api/python/utils/tensor_utils.h:98
98 CHECK_OR_RETURN(tensor->is_eager()) << "eager tensors supported only.";
Press Enter, and this line of Python triggers the breakpoint.
The output above shows that execution stopped inside the function oneflow::one::CopyBetweenMirroredTensorAndNumpy, at line 98 of tensor_utils.h.
To show the surrounding source, type l. Line 98 contains the call to tensor->is_eager().
(gdb) l
93 inline Maybe<void> CopyBetweenMirroredTensorAndNumpy(
94     const std::shared_ptr<Tensor>& t, PyObject* array,
95     Maybe<void> (*Copy)(uint64_t, const NumPyArrayPtr&), const std::string& modifier,
96     bool block_host_until_done) {
97   auto tensor = JUST(t->AsMirroredTensor());
98   CHECK_OR_RETURN(tensor->is_eager()) << "eager tensors supported only.";
99
100   if (block_host_until_done) {
101     NumPyArrayPtr array_ptr(array);
102     const auto& Callback = [array_ptr, Copy](uint64_t ofblob_ptr) {
You might wonder why creating a tensor in Python triggers a call to Tensor::is_eager. Type bt to see the call stack.
(gdb) bt
#0 oneflow::one::CopyBetweenMirroredTensorAndNumpy<long> (t=..., array=array@entry=0x7fffe5905150, Copy=<optimized out>,
Copy@entry=0x7fffefc977e0 <oneflow::BlobNumpyCopyUtil<long>::From(unsigned long, oneflow::NumPyArrayPtr const&)>, modifier=...,
block_host_until_done=block_host_until_done@entry=false) at /home/wkyi/w/oneflow/oneflow/api/python/utils/tensor_utils.h:98
#1 0x00007fffefd5aa5c in oneflow::one::CopyMirroredTensorFromUntypedArray<long> (array=0x7fffe5905150, tensor=...)
at /home/wkyi/w/oneflow/oneflow/api/python/utils/tensor_utils.cpp:61
#13 0x00007fffefbe433f in oneflow::one::functional::tensor (self=<optimized out>, args=<optimized out>, kwargs=<optimized out>)
at /home/wkyi/w/oneflow/build/oneflow/api/python/functional/tensor_api.yaml.pybind.cpp:96
#14 0x00005555556b98b4 in _PyMethodDef_RawFastCallKeywords () at /tmp/build/80754af9/python_1614362349910/work/Objects/call.c:693
#15 0x00005555556b99d1 in _PyCFunction_FastCallKeywords (func=0x7ffdc75675a0, args=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>)
at /tmp/build/80754af9/python_1614362349910/work/Objects/call.c:732
#29 0x000055555578c22c in _Py_UnixMain () at /tmp/build/80754af9/python_1614362349910/work/Modules/main.c:3495
#30 0x00007ffff783113a in __libc_start_main () from /lib64/libc.so.6
#31 0x0000555555730e90 in _start () at ../sysdeps/x86_64/elf/start.S:103
At the bottom of the call stack is _start, the entry point of the Python interpreter executable. The backtrace shows the call boundary between Python and the OneFlow shared library: the Python function _PyMethodDef_RawFastCallKeywords calls OneFlow's C++ function oneflow::one::functional::tensor, which eventually leads to the call to oneflow::one::Tensor::is_eager.
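For long backtraces, the boundary between Python and OneFlow can be located mechanically. The sketch below is only an illustration: parse_frames and find_boundary are hypothetical helpers, the regular expression assumes the "#N [address in] function (" layout shown in the bt output above, and the sample text is a shortened copy of that output.

```python
import re

# Matches lines such as "#14 0x0000... in _PyMethodDef_RawFastCallKeywords () at ...".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.+?) \(")


def parse_frames(bt_text):
    """Extract (frame_number, function_name) pairs from GDB `bt` output."""
    frames = []
    for line in bt_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def find_boundary(frames, prefix="oneflow::"):
    """Return (callee, caller) where a prefixed frame is called from outside.

    Frames are compared in listing order, so if the listing skips frames
    the reported caller is the next frame that was actually printed.
    """
    boundary = None
    for inner, outer in zip(frames, frames[1:]):
        if inner[1].startswith(prefix) and not outer[1].startswith(prefix):
            boundary = (inner, outer)
    return boundary


SAMPLE_BT = """\
#0  oneflow::one::CopyBetweenMirroredTensorAndNumpy<long> (t=...) at tensor_utils.h:98
#13 0x00007fffefbe433f in oneflow::one::functional::tensor (self=<optimized out>) at tensor_api.yaml.pybind.cpp:96
#14 0x00005555556b98b4 in _PyMethodDef_RawFastCallKeywords () at call.c:693
#31 0x0000555555730e90 in _start () at start.S:103
"""

print(find_boundary(parse_frames(SAMPLE_BT)))
```

On the sample, the reported pair is frame 13 (oneflow::one::functional::tensor) and frame 14 (_PyMethodDef_RawFastCallKeywords), matching the boundary described above.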

{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "cppdbg",
      "request": "launch",
      "name": "GDB",
      "program": "/home/charlieyan/anaconda3/envs/oneflow-dev-gcc7-v2/bin/python",
      "cwd": ".",
      "environment": [
        {
          "name": "PYTHONPATH",
          "value": "/home/charlieyan/proj/oneflow/python"
        }
      ]
    }
  ]
}

2.https://of-worldwide.quip.com/JuQ0AuodVJn4/Use-GDB-to-Walkthrough-OneFlow-Source-Code
(This article is compiled and published with authorization.)

This article is from the WeChat official account OneFlow (OneFlowTechnology).