Interview Quick Review (2): Why Cross-Entropy Works
2022-07-18 02:41:00 【锌a】
Cross-Entropy (CrossEntropy)
The multi-class cross-entropy formula is:
$$J = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K} y_k^{(i)}\log\left(p_k^{(i)}\right)$$
Here $m$ is the number of samples, $K$ is the number of classes, and $y_k^{(i)}$ is the one-hot encoded label of the $i$-th sample for class $k$: it is 1 when sample $i$ belongs to class $k$ and 0 otherwise. $p_k^{(i)}$ is the predicted score of the $i$-th sample for class $k$ (a probability, typically a softmax output), and $\log$ is the natural logarithm.
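The formula can be sketched directly in code. This is only a minimal illustration using the Python standard library; the function name `cross_entropy` and the sample data are assumptions made for the example, not part of the original post.

```python
import math

def cross_entropy(y_true, y_pred):
    """Mean multi-class cross-entropy J over m samples.

    y_true: list of one-hot label rows, each of length K.
    y_pred: list of predicted probability rows, each of length K.
    """
    m = len(y_true)
    total = 0.0
    for y_row, p_row in zip(y_true, y_pred):
        for y, p in zip(y_row, p_row):
            if y != 0:  # zero labels contribute nothing, so skip them
                total += y * math.log(p)
    return -total / m

# Hypothetical data: m = 2 samples, K = 3 classes.
y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
J = cross_entropy(y_true, y_pred)
print(J)
```

Only the probabilities assigned to the true classes (0.7 and 0.8 here) affect the result.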
Because of the one-hot encoding, $y_k^{(i)} = 0$ for every class other than the true one, so only the term for the true class survives in $\sum_{k=1}^{K} y_k^{(i)}\log(p_k^{(i)})$. For example, if the true class is class 1, then
$$\sum_{k=1}^{K} y_k^{(i)}\log\left(p_k^{(i)}\right) = y_1^{(i)}\log\left(p_1^{(i)}\right) = \log\left(p_1^{(i)}\right)$$
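The collapse of the inner sum to a single term can be checked numerically. The sketch below uses hypothetical helper names (`full_sum`, `true_class_term`) and made-up probabilities purely for illustration.

```python
import math

def full_sum(y_row, p_row):
    # Sum over all K classes of y_k * log(p_k).
    return sum(y * math.log(p) for y, p in zip(y_row, p_row))

def true_class_term(y_row, p_row):
    # With a one-hot label, only the true class contributes log(p_k).
    k = y_row.index(1)
    return math.log(p_row[k])

# Hypothetical sample whose true class is class 1 (index 0).
y_row = [1, 0, 0]
p_row = [0.7, 0.2, 0.1]

both_equal = math.isclose(full_sum(y_row, p_row), true_class_term(y_row, p_row))
print(both_equal)
```

The two functions agree, which is why a single sample's loss depends only on the probability predicted for its true class.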
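So for a single sample, we only need to consider the relationship between the label value $1$ and the probability $p$ predicted for the true class.
Plot the function $y = \log(x)$: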

It can be seen that the closer $p$ is to 0, the larger $-\log(p_1^{(i)})$ becomes ($\log(p)$ itself is negative, and the minus sign in front of the cross-entropy $J$ makes it positive); the closer $p$ is to 1, the smaller the loss.
For example, when $p = 0.03$, $\log(p) \approx -3.51$ and $loss = -\log(p) \approx 3.51$.
When $p = 0.5$, $\log(p) \approx -0.69$ and $loss = -\log(p) \approx 0.69$.
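These values are easy to reproduce. The following sketch assumes the natural logarithm (consistent with $-\log(0.5) \approx 0.69$); `sample_loss` and the list of probabilities are illustrative choices, not from the original post.

```python
import math

def sample_loss(p):
    # Per-sample cross-entropy when the true class is predicted with probability p.
    return -math.log(p)

# Hypothetical true-class probabilities, from a poor prediction to a good one.
probs = [0.03, 0.5, 0.9, 0.99]
losses = [sample_loss(p) for p in probs]

for p, loss in zip(probs, losses):
    print(f"p = {p:<5} loss = {loss:.2f}")
```

The loss shrinks steadily as $p$ moves from near 0 toward 1.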
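In this way, the multi-class task is reduced to how far the model's predicted value for the true class is from 1, measured through the log function: the closer the prediction is to 1, the lower the loss, and the closer it is to 0, the higher the loss.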