Base64 encoding and decoding principle and C language implementation
2022-07-18 14:00:00 【hwd00001】
References:
1. Background on the principle: the article "Thoroughly Understanding the Base64 Encoding Principle in One Article", by New perspective of procedure.
2. Code reference: the article "Implementing Base64 Encode and Decode Functions in C", by ssmile.
0. The purpose of Base64 encoding
Base64 represents an arbitrary byte stream (each byte can take any value from 0 to 255) using the following 64 printable characters, plus one additional padding character, '='.
“ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/”
1. How Base64 encoding works
Base64 encoding splits the input into groups of three 8-bit bytes and regroups each 24-bit group into four 6-bit values. Each 6-bit value is stored in its own byte (only 6 bits are significant; the top two bits are always 0) and is then used as an index into the Base64 encoding index table. The characters found there are concatenated to form the output string.
After encoding, every 3 bytes become 4 bytes, so the output is one third larger than the input (slightly more when padding is needed).
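The 3-to-4 size relationship can be written as a small helper for sizing the output buffer. This is only a sketch: the name `base64_encoded_len` is hypothetical and does not appear in the referenced code, and the result does not include room for a terminating NUL.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the original article's code):
 * number of Base64 characters produced for n input bytes, padding
 * included. Every started group of 3 input bytes yields 4 characters. */
static size_t base64_encoded_len(size_t n) {
    return (n + 2) / 3 * 4;
}
```

For example, 3 input bytes need 4 output characters, and 4 input bytes need 8.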
Let's illustrate with an example:
1.1 Example
The table in the figure below serves as the example; let's walk through the whole process. 
【Step 1】: The ASCII codes of "M", "a" and "n" are 77, 97 and 110, and their binary values are 01001101, 01100001 and 01101110. As the second and third rows of the figure show, together they form a 24-bit binary string.
【Step 2】: As the red boxes show, the 24 bits are divided into four groups of 6 bits each.
【Step 3】: Two 0 bits are prepended to each group, expanding the 24 bits to 32 bits, that is, four bytes: 00010011, 00010110, 00000101, 00101110. Their values (the Base64 indices) are 19, 22, 5 and 46.
【Step 4】: Looking these values up in the Base64 encoding table gives T, W, F and u. So the string "Man" encodes to "TWFu".
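The bit regrouping in these steps can be sketched in a few lines of C. The function name `split_3_bytes` is made up for this illustration; it only computes the four indices and leaves the table lookup to the caller.

```c
/* Illustrative helper (hypothetical name): split three input bytes
 * into four 6-bit Base64 indices in the range 0..63. */
static void split_3_bytes(const unsigned char in[3], unsigned char idx[4]) {
    /* Pack the three bytes into one 24-bit value. */
    unsigned long bits = ((unsigned long)in[0] << 16) |
                         ((unsigned long)in[1] << 8)  |
                          (unsigned long)in[2];
    idx[0] = (unsigned char)((bits >> 18) & 0x3F); /* top 6 bits */
    idx[1] = (unsigned char)((bits >> 12) & 0x3F);
    idx[2] = (unsigned char)((bits >> 6)  & 0x3F);
    idx[3] = (unsigned char)( bits        & 0x3F); /* bottom 6 bits */
}
```

Calling it on the bytes 'M', 'a', 'n' yields the indices 19, 22, 5 and 46 from the example.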
1.2 When fewer than 3 bytes remain
The description above assumes a full group of three bytes. What happens when fewer than three bytes are left?
One byte: a single byte has 8 bits, which are still split into 6-bit groups. The second group is 4 bits short and is filled with 0 bits, giving two Base64 characters. The last two groups have no data at all, so both are filled with "=". Thus "A" in the figure above becomes "QQ==".
Two bytes: two bytes have 16 bits in total, again split into 6-bit groups. The third group is 2 bits short and is filled with 0 bits, giving three Base64 characters. The fourth group has no data and is filled with "=". Thus "BC" in the figure above becomes "QkM=" (index 36 maps to the lowercase "k").
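The two padding cases can be sketched as a standalone helper. This is an illustration under assumptions, not the referenced author's code: the names `b64_chars` and `encode_tail` are hypothetical, and the alphabet is written as a string literal for brevity.

```c
#include <stddef.h>

/* The 64 Base64 characters as a string literal (hypothetical name). */
static const char b64_chars[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encode a final group of n = 1 or 2 bytes into
 * 4 characters plus a terminating NUL. Returns 0 on success and -1 if
 * the arguments are invalid. */
static int encode_tail(const unsigned char *in, int n, char out[5]) {
    if (in == NULL || out == NULL || (n != 1 && n != 2)) return -1;
    unsigned b0 = in[0];
    unsigned b1 = (n == 2) ? in[1] : 0; /* missing bits are zero-filled */
    out[0] = b64_chars[b0 >> 2];
    out[1] = b64_chars[((b0 & 0x03) << 4) | (b1 >> 4)];
    out[2] = (n == 2) ? b64_chars[(b1 & 0x0F) << 2] : '=';
    out[3] = '=';
    out[4] = '\0';
    return 0;
}
```

Passing the single byte 'A' with n = 1 fills `out` with "QQ==".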
The C source code is as follows (largely copied from ssmile):
// Base64 encoding table, 64 characters in total
static const char base64_alphabet[] = {
'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N',
'O', 'P', 'Q', 'R', 'S', 'T',
'U', 'V', 'W', 'X', 'Y', 'Z',
'a', 'b', 'c', 'd', 'e', 'f', 'g',
'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't',
'u', 'v', 'w', 'x', 'y', 'z',
'0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
'+', '/'};
// Shift src left by lnum bits, then right by rnum bits. Because src is an
// 8-bit value, the left shift discards the bits pushed past bit 7.
static char cmove_bits(unsigned char src, unsigned lnum, unsigned rnum) {
    src <<= lnum;
    src >>= rnum;
    return src;
}

// outdata must have room for (inlen + 2) / 3 * 4 bytes; no terminating
// '\0' is written.
int base64_encode(char *indata, int inlen, char *outdata, int *outlen) {
    int ret = 0; // return value
    if (indata == NULL || inlen <= 0 || outdata == NULL) {
        return ret = -1;
    }
    int in_len = 0;  // input length rounded up to a multiple of 3
    int pad_num = 0; // bytes missing from the last group: 2, 1 or 0
    if (inlen % 3 != 0) {
        pad_num = 3 - inlen % 3;
    }
    in_len = inlen + pad_num;     // length actually encoded (a multiple of 3)
    int out_len = in_len * 8 / 6; // length after encoding
    char *p = outdata;            // write pointer into the output buffer
    // Encode over the rounded-up length, one group of 3 bytes at a time
    for (int i = 0; i < in_len; i += 3) {
        // Top 6 bits of the first byte. The cast keeps bytes >= 0x80 from
        // producing a negative index on platforms where char is signed.
        int value = (unsigned char)*indata >> 2;
        char c = base64_alphabet[value]; // look up the Base64 character
        *p = c;                          // first output character of the group
        // Last group, when it holds fewer than 3 input bytes
        if (i == inlen + pad_num - 3 && pad_num != 0) {
            if (pad_num == 1) {
                // Two input bytes remain: three characters and one '='
                *(p + 1) = base64_alphabet[(int)(cmove_bits(*indata, 6, 2) + cmove_bits(*(indata + 1), 0, 4))];
                *(p + 2) = base64_alphabet[(int)cmove_bits(*(indata + 1), 4, 2)];
                *(p + 3) = '=';
            } else if (pad_num == 2) {
                // One input byte remains: two characters and two '='
                *(p + 1) = base64_alphabet[(int)cmove_bits(*indata, 6, 2)];
                *(p + 2) = '=';
                *(p + 3) = '=';
            }
        } else {
            // A full group of 3 bytes
            *(p + 1) = base64_alphabet[cmove_bits(*indata, 6, 2) + cmove_bits(*(indata + 1), 0, 4)];
            *(p + 2) = base64_alphabet[cmove_bits(*(indata + 1), 4, 2) + cmove_bits(*(indata + 2), 0, 6)];
            *(p + 3) = base64_alphabet[*(indata + 2) & 0x3f];
        }
        p += 4;
        indata += 3;
    }
    if (outlen != NULL) {
        *outlen = out_len;
    }
    return ret;
}
2. How Base64 decoding works
Decoding reverses the process: every 4 encoded bytes (each carrying 6 significant bits) are merged back into three 8-bit bytes.
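Because every 4 characters map back to at most 3 bytes, the decoded size can be computed before decoding. The helper below is a sketch with a hypothetical name, `base64_decoded_len`; it assumes a padded Base64 string without embedded whitespace.

```c
#include <stddef.h>

/* Hypothetical helper: decoded size in bytes of a padded Base64 string
 * of length len that contains no whitespace. Returns 0 when the length
 * is not a positive multiple of 4. */
static size_t base64_decoded_len(const char *s, size_t len) {
    if (s == NULL || len == 0 || len % 4 != 0) return 0;
    size_t n = len / 4 * 3;
    if (s[len - 1] == '=') n--; /* one '=' removes one byte */
    if (s[len - 2] == '=') n--; /* a second '=' removes another */
    return n;
}
```

A caller could use this value to allocate the output buffer before decoding.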
2.1 Worked example
Take "TWFu" as the example and decode it. Refer back to the first figure, this time reading it from the bottom up.
Approach
【Step 1】: The characters of "TWFu" sit at positions 19, 22, 5 and 46 in the encoding index table. In binary these are 00010011, 00010110, 00000101 and 00101110. The top 2 bits of each are not significant and are always 0, so only the low 6 bits are used.
【Step 2】: The significant bits of the 4 values are 010011, 010110, 000101 and 101110.
【Step 3】: Concatenate the 4 groups of significant bits into 24 bits, then split them into 3 bytes (marked with []).
[010011 01][0110 0001][01 101110]. In decimal these are 77, 97 and 110, which are the ASCII codes of "Man".
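The merge in step 3 can be sketched as follows. The function name `join_4_indices` is hypothetical; it takes four 6-bit index values that have already been looked up and produces the three original bytes.

```c
/* Illustrative helper (hypothetical name): join four 6-bit values into
 * three 8-bit bytes. Only the low 6 bits of each input are used. */
static void join_4_indices(const unsigned char idx[4], unsigned char out[3]) {
    /* Concatenate the four 6-bit groups into one 24-bit value. */
    unsigned long bits = ((unsigned long)(idx[0] & 0x3F) << 18) |
                         ((unsigned long)(idx[1] & 0x3F) << 12) |
                         ((unsigned long)(idx[2] & 0x3F) << 6)  |
                          (unsigned long)(idx[3] & 0x3F);
    out[0] = (unsigned char)((bits >> 16) & 0xFF);
    out[1] = (unsigned char)((bits >> 8)  & 0xFF);
    out[2] = (unsigned char)( bits        & 0xFF);
}
```

With the indices 19, 22, 5 and 46 it produces the bytes 77, 97 and 110.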
2.2 Building the decode index table
Decoding needs the position of each character in the encoding index table. Searching the table for every character is slow, so to improve efficiency we can prepare a 128-byte decode index table that is indexed by character code (the complete code in section 3 declares 256 entries, so any byte value is a valid index). For example, 'T' in "TWFu" has decimal code 84 and sits at position 19 in the encoding index table, so entry 84 of the decode index table holds 19. Likewise, 'W' has decimal code 87 and position 22, so entry 87 holds 22. In the same way, the entries for all 64 encoding characters hold their indices in the encoding index table. If we name the decode index table base64DecodeChars, the following correspondences hold in C notation:
base64DecodeChars['T'] --- 19
base64DecodeChars['W'] --- 22
base64DecodeChars['F'] --- 5
base64DecodeChars['u'] --- 46
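Instead of writing such a table out by hand, it can also be filled in from the alphabet at startup. The sketch below assumes a 256-entry table and uses 255 as the "not a Base64 character" marker; the function name `build_decode_table` is hypothetical, and special codes for '=' or whitespace are left out.

```c
#include <string.h>

/* Hypothetical helper: fill a 256-entry decode table so that table[c]
 * is the index of character c in the 64-character alphabet, or 255 if
 * c is not part of the alphabet. */
static void build_decode_table(const char *alphabet, unsigned char table[256]) {
    memset(table, 255, 256); /* mark every entry as invalid first */
    if (alphabet == NULL) return;
    for (int i = 0; i < 64; i++) {
        table[(unsigned char)alphabet[i]] = (unsigned char)i;
    }
}
```

Building the table once costs 64 assignments, after which each lookup during decoding is a single array access.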
3. Complete code
#include <stdio.h>
#include <stdlib.h>
// Base64 encoding table, 64 characters in total
static const char base64_alphabet[] = {
'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N',
'O', 'P', 'Q', 'R', 'S', 'T',
'U', 'V', 'W', 'X', 'Y', 'Z',
'a', 'b', 'c', 'd', 'e', 'f', 'g',
'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't',
'u', 'v', 'w', 'x', 'y', 'z',
'0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
'+', '/'};
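// Decode index table (called base64DecodeChars in section 2.2), indexed by
// character code: 255 = invalid character, 254 = '=', 253 = whitespace to skip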
static const unsigned char base64_suffix_map[256] = {
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 253, 255,
255, 253, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 253, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 62, 255, 255, 255, 63,
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 255, 255,
255, 254, 255, 255, 255, 0, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 255, 255, 255, 255, 255,
255, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
49, 50, 51, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255 };
// Shift src left by lnum bits, then right by rnum bits. Because src is an
// 8-bit value, the left shift discards the bits pushed past bit 7.
static char cmove_bits(unsigned char src, unsigned lnum, unsigned rnum) {
    src <<= lnum;
    src >>= rnum;
    return src;
}

// outdata must have room for (inlen + 2) / 3 * 4 bytes; no terminating
// '\0' is written.
int base64_encode(char *indata, int inlen, char *outdata, int *outlen) {
    int ret = 0; // return value
    if (indata == NULL || inlen <= 0 || outdata == NULL) {
        return ret = -1;
    }
    int in_len = 0;  // input length rounded up to a multiple of 3
    int pad_num = 0; // bytes missing from the last group: 2, 1 or 0
    if (inlen % 3 != 0) {
        pad_num = 3 - inlen % 3;
    }
    in_len = inlen + pad_num;     // length actually encoded (a multiple of 3)
    int out_len = in_len * 8 / 6; // length after encoding
    char *p = outdata;            // write pointer into the output buffer
    // Encode over the rounded-up length, one group of 3 bytes at a time
    for (int i = 0; i < in_len; i += 3) {
        // Top 6 bits of the first byte. The cast keeps bytes >= 0x80 from
        // producing a negative index on platforms where char is signed.
        int value = (unsigned char)*indata >> 2;
        char c = base64_alphabet[value]; // look up the Base64 character
        *p = c;                          // first output character of the group
        // Last group, when it holds fewer than 3 input bytes
        if (i == inlen + pad_num - 3 && pad_num != 0) {
            if (pad_num == 1) {
                // Two input bytes remain: three characters and one '='
                *(p + 1) = base64_alphabet[(int)(cmove_bits(*indata, 6, 2) + cmove_bits(*(indata + 1), 0, 4))];
                *(p + 2) = base64_alphabet[(int)cmove_bits(*(indata + 1), 4, 2)];
                *(p + 3) = '=';
            } else if (pad_num == 2) {
                // One input byte remains: two characters and two '='
                *(p + 1) = base64_alphabet[(int)cmove_bits(*indata, 6, 2)];
                *(p + 2) = '=';
                *(p + 3) = '=';
            }
        } else {
            // A full group of 3 bytes
            *(p + 1) = base64_alphabet[cmove_bits(*indata, 6, 2) + cmove_bits(*(indata + 1), 0, 4)];
            *(p + 2) = base64_alphabet[cmove_bits(*(indata + 1), 4, 2) + cmove_bits(*(indata + 2), 0, 6)];
            *(p + 3) = base64_alphabet[*(indata + 2) & 0x3f];
        }
        p += 4;
        indata += 3;
    }
    if (outlen != NULL) {
        *outlen = out_len;
    }
    return ret;
}
int base64_decode(const char *indata, int inlen, char *outdata, int *outlen) {
    int ret = 0;
    if (indata == NULL || inlen <= 0 || outdata == NULL || outlen == NULL) {
        return ret = -1;
    }
    if (inlen % 4 != 0) {
        // The input length is not a multiple of 4
        return ret = -2;
    }
    int t = 0, x = 0, y = 0, i = 0;
    unsigned char c = 0;
    int g = 3; // number of bytes to emit for a group; reduced by each '='
    while (x < inlen) {
        // Map the input character to its value in base64_suffix_map. The
        // cast avoids a negative index on platforms where char is signed.
        c = base64_suffix_map[(unsigned char)indata[x++]];
        if (c == 255) return -1; // not a character of the encoding table
        if (c == 253) continue;  // line feed, carriage return or space
        if (c == 254) {          // '=' padding
            c = 0;
            g--;
        }
        t = (t << 6) | c; // accumulate four 6-bit values into 24 bits of an int
        if (++y == 4) {
            outdata[i++] = (unsigned char)((t >> 16) & 0xff);
            if (g > 1) outdata[i++] = (unsigned char)((t >> 8) & 0xff);
            if (g > 2) outdata[i++] = (unsigned char)(t & 0xff);
            y = t = 0;
        }
    }
    if (outlen != NULL) {
        *outlen = i;
    }
    return ret;
}