HDFS Read/Write Process
2022-07-19 01:45:00 · Hyf
Catalog
1. HDFS Write Data Flow
2. Node Distance Calculation
3. Replica Node Selection
4. HDFS Read Data Flow
1. HDFS Write Data Flow
Flow chart (image from Shang Silicon Valley):

Flow chart analysis:
1. The client requests a file upload from the NameNode through the DistributedFileSystem module. The NameNode checks whether the target file already exists and whether its parent directory exists. (A client-side code sketch follows after this list.)
2. The NameNode responds whether the upload is allowed.
3. The client asks which DataNode servers the first Block should be uploaded to.
4. The NameNode returns three DataNodes: dn1, dn2, and dn3.
5. The client requests dn1 to upload data through the FSDataOutputStream module; on receiving the request, dn1 calls dn2, and dn2 calls dn3, establishing the transmission pipeline.
6. dn1, dn2, and dn3 acknowledge back to the client step by step.
7. The client starts uploading the first Block to dn1 (the data is first read from disk into a local memory cache) in units of Packets. When dn1 receives a Packet it passes it on to dn2, and dn2 passes it on to dn3; for every Packet it sends, dn1 places it in an acknowledgment queue to wait for the reply.
8. Once one Block has finished transferring, the client again asks the NameNode which servers to upload the next Block to (repeat steps 3-7).
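Steps 1, 5, and 7 are all driven through the Hadoop client API. Below is a minimal sketch of uploading a local file, assuming a Hadoop client on the classpath and a NameNode at hdfs://nn:8020 (a hypothetical address, as are the file paths); the DataNode pipeline and the Packet/acknowledgment mechanics happen inside FSDataOutputStream.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode address; normally taken from core-site.xml.
        conf.set("fs.defaultFS", "hdfs://nn:8020");

        try (FileSystem fs = FileSystem.get(conf);                      // DistributedFileSystem client
             InputStream in = Files.newInputStream(Paths.get("local.txt"));
             // Steps 1-5: the NameNode is asked to create the file, picks the
             // DataNodes, and the stream builds the dn1 -> dn2 -> dn3 pipeline.
             FSDataOutputStream out = fs.create(new Path("/user/demo/remote.txt"))) {
            // Step 7: data is streamed to dn1 in Packet-sized chunks.
            IOUtils.copyBytes(in, out, 4096, false);
        } // Closing the stream flushes the last Packet and completes the Block(s).
    }
}
```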
2. Node Distance Calculation
Node distance: the sum of the distances from the two nodes to their nearest common ancestor.
Read data flow chart (from Shang Silicon Valley):

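As a concrete illustration of the definition above, the sketch below computes the distance between two nodes from their topology path strings (e.g. /d1/rack1/node1). It is a hypothetical helper that mirrors the idea behind Hadoop's org.apache.hadoop.net.NetworkTopology.getDistance(), not the production implementation: each hop from a node up to the nearest common ancestor counts as one.

```java
/** Hypothetical helper: node distance = hops from each node up to their nearest common ancestor. */
public class NodeDistance {

    // Topology paths look like /datacenter/rack/node, as in HDFS rack awareness.
    static int distance(String pathA, String pathB) {
        String[] a = pathA.substring(1).split("/");
        String[] b = pathB.substring(1).split("/");

        // Length of the shared prefix = depth of the nearest common ancestor.
        int common = 0;
        while (common < a.length && common < b.length && a[common].equals(b[common])) {
            common++;
        }
        // Remaining levels on each side after the shared prefix.
        return (a.length - common) + (b.length - common);
    }

    public static void main(String[] args) {
        System.out.println(distance("/d1/r1/n0", "/d1/r1/n0")); // 0: same node
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n2")); // 2: same rack
        System.out.println(distance("/d1/r1/n1", "/d1/r2/n0")); // 4: same data center, different rack
        System.out.println(distance("/d1/r1/n1", "/d2/r3/n1")); // 6: different data centers
    }
}
```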
3. Replica Node Selection
1. The first replica is placed on the node where the client runs. If the client is outside the cluster, a node is chosen at random.
2. The second replica is placed on a random node in a different rack.
3. The third replica is placed on a random node in the same rack as the second replica.
Reason for this placement: the first replica is on the nearest node, giving the fastest upload; spreading the others across racks ensures data reliability. The sketch below shows how to inspect where the replicas actually landed.
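To check replica placement, the FileSystem API can report block locations. The sketch below prints, for each Block, the hosts that hold its replicas; it assumes the same hypothetical hdfs://nn:8020 cluster and file path as the write example. The replication factor itself comes from the dfs.replication setting (3 by default).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowReplicaPlacement {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://nn:8020"); // hypothetical NameNode address

        try (FileSystem fs = FileSystem.get(conf)) {
            Path file = new Path("/user/demo/remote.txt"); // hypothetical path
            FileStatus status = fs.getFileStatus(file);

            // One BlockLocation per Block; getHosts() lists the DataNodes holding its replicas.
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.println("offset " + block.getOffset()
                        + " -> replicas on " + String.join(", ", block.getHosts()));
            }
        }
    }
}
```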
4. HDFS Read Data Flow
The following figure is from Shang Silicon Valley:

If one node is serving too many reads (or a large amount of data is being read), the client will also access other replica nodes and let them serve the data.
1. The client requests a file download from the NameNode through DistributedFileSystem. The NameNode queries its metadata and finds the DataNode addresses holding the file's blocks.
2. The client picks a DataNode server (nearest first, then at random among equally near ones) and requests the data.
3. The DataNode starts transferring data to the client (it reads an input stream from disk and sends the data in units of Packets, verifying checksums per Packet).
4. The client receives the data in units of Packets, caches it locally first, and then writes it to the target file. A client-side read sketch follows below.
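The read side is wrapped by the same client API: opening a path returns an FSDataInputStream that asks the NameNode for block locations and then streams Packets from the chosen DataNodes. A minimal sketch, again assuming the hypothetical hdfs://nn:8020 cluster and file path:

```java
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://nn:8020"); // hypothetical NameNode address

        try (FileSystem fs = FileSystem.get(conf);
             // Step 1: the NameNode is queried for the block locations of the file.
             FSDataInputStream in = fs.open(new Path("/user/demo/remote.txt"));
             OutputStream out = Files.newOutputStream(Paths.get("downloaded.txt"))) {
            // Steps 2-4: blocks are read from the nearest DataNodes Packet by Packet,
            // checksum-verified, and written to the local target file.
            IOUtils.copyBytes(in, out, 4096, false);
        }
    }
}
```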