当前位置:网站首页>R language -- principle of Cox model calibration curve (I) data source
R language -- principle of Cox model calibration curve (I) data source
2022-07-19 12:47:00 【Eat two bites at a time】
List of articles
Preface
About cox The calibration curve of the model is drawn , You can search in the browser , Most of the parameter settings have been made very clear ( There is not much to say here ), But about the principle of calibration curve , About Calibrate each point on the graph 、 The meaning of the error line and how to calculate it , I believe most people don't quite understand . By reading the source code , Record what I learned here , It can also make some people better understand their own data and models .
One 、 What is the calibration curve ?
The calibration curve is a scatter plot of the actual occurrence rate and the predicted occurrence rate .
This article will introduce the points in the figure 、 Data behind the line . Let's take a look at an ordinary calibration curve ( Parameters not changed ).
Two 、 Calibration curve
1. How to draw ?
( The sending assistant said there was too little content , I use the parameter water point length )
Mainly drawing parameters :
- x: adopt calibrate Function to get object
- xlab:x Axis title
- ylab:y Axis title
- subtitles = TRUE: Subheading , Here refers to the notes at the bottom of the figure ( Small text ), The default is TRUE, You may need to set it when posting an article FALSE To remove .
- conf.int = TRUE: Whether it is necessary to draw error lines , The default is TRUE, draw
- cex.subtitles = 0.75: The text size of the sub title
- riskdist = TRUE: Whether to draw the shaft whisker diagram above the diagram , It's the thing like long hair
- add = FALSE, Whether to create a new graph or add it to another graph .
- scat1d.opts: This parameter is used to set the axis whisker graph , Don't pay too much attention to , Details refer to scat1d The parameters of the function .
- par.corrected: Parameter type is list, Parameters for setting error points . The default is NULL, The function will be automatically assigned as
list(col = "blue", lty = 1, lwd = 1, pch = 4), Your new parameter will overwrite the duplicate parameter . For example, editors may have strange requirements , Not that calibrated little one x spot , You can setpar.corrected=list(col='white).
dd=datadist(data)
options(datadist="dd")
f<-cph(Surv(DMFS new ,DMFS Outcome type )~ Histological subdivision +T Sub classification +S100,time.inc=60,data = data,x=TRUE,y=TRUE,surv=TRUE)
cal <- calibrate(f, cmethod="KM", method="boot", u=60, m= round(nrow(data)/3), B=1000)
cal
plot(cal,xlim=c(0,1),ylim=c(0,1))

2. How to understand ?
From the calibration curve, we can find , There is 6 individual points,3 A solid 3 individual x shape points, also 3 Error bars . combination cal The calculation result shows .
Solid points: Abscissa : yes calibrate function ( hereinafter referred to as cal) In the calculation results KM Column , The ordinate is :cal Medium mean.predicted Column .
x points: The ordinate of the forked point is cal Medium KM.corrected Column .
Error bar : The error is based on cal Medium std.err The specific calculation function calculated by the column is : Top :ifelse(KM == 0, 0, pmin(1, KM * exp(1.959964 * std.err))). lower end :ifelse(KM == 0, 0, KM * exp(1.959964 * (-std.err))).
Notes in the lower left and right corners : You can find some annotation information below the figure , The meanings are respectively :
- n=72 On behalf of 72 data
- d=29 The number of events representing the outcome is 29( That is, in the data 29 Example sutus=1)
- p=6 representative cox The coefficients in the model are 6 individual ( Be careful : The number of coefficients here is not the same as the number of variables modeled , Because the classification variable will have multiple coefficients )
- 24 subjects per group Indicates data grouping ( I will introduce it later when calculating ) when , Each group 24 Data
- Gray: ideal The gray line is the ideal line
- B=1000 Here again, it means that 1000 Secondary resampling calculation .
- Based on observed-predicted Express cal In the data index.orig How is this column calculated , It has no effect on drawing ( Dig a hole , I'll talk about it later ).
3. verification
Then run the following code according to the above to see whether the figure is the same .
## Prepare error bar chart data
errl <- ifelse(cal[,"KM"] == 0, 0, cal[,"KM"] * exp(1.959964 * (-cal[,"std.err"])))
errh <- ifelse(cal[,"KM"] == 0, 0, pmin(1, cal[,"KM"] * exp(1.959964 * cal[,"std.err"])))
## Draw an error line
errbar(x = cal[,"mean.predicted"],y = cal[,"KM"],yminus = errl,yplus = errh,
pch=16,cex=1.2,xlim = c(0,1),ylim = c(0,1),asp=1,xaxs='i',
yaxs='i',
xlab = 'Fraction surviving 60 Day',
ylab = 'Predicted 60 Days Survival')
## Add gray reference diagonal
abline(a = 0,b = 1,col='grey')
## Add the attachment
lines(x = cal[,"mean.predicted"],y = cal[,"KM"])
## Add the calibrated point
points(x = cal[,"mean.predicted"],y = cal[,"KM.corrected"],pch=4)
## Draw a whisker diagram
scat1d(x = attr(cal,"predicted"),nhistSpike = 200)

Can't say very similar , I can only say it's the same ( To show that I really don't just change the original drawing parameters , It is obvious that the gray reference line layer in the above figure is under the implementation ). Yes, of course , Due to the problem of coordinate proportion in the original function , The appearance may be different , But the meaning of the picture is really the same .
summary
In order to better understand the calibration curve , I went to see the source code of related functions , It can be regarded as one's own learning record .
This article is just the beginning , It's far from over , It will continue to be more .
My little white , If there is a mistake , Please criticize and correct me
边栏推荐
- Review the 2008 financial crisis and become a long-term investor 2020-03-19
- Ultrasonic sensor (chx01) learning notes Ⅲ - I2C reading and writing operation
- PostgreSQL function usage record
- 机器学习作业1
- Committer identity unknown *** Please tell me who you are...
- Travail du quatrième jour
- Enrollment publicity - Jiangnan University
- Opencv based on DLCO descriptor matching
- The difference and use between get request and post request
- 懒到骨子里了,我在CSDN写文章都懒得自己写了,基于selenium模拟写文章
猜你喜欢

Experiment the next day

OpenCV 教程 03: 如何跟踪视频中的某一对象

第一天实验

Dynamic memory planning

String correlation function (II)

The difference and use between get request and post request

Pytorch version: yolov4 integrating attention and mobilenet
MIHA tour 2023 autumn recruitment officially begins ~ early approval has the opportunity to avoid written examination!

S32K148_ Can drive (bare metal development)

Create a custom palette in swiftui color tutorial
随机推荐
Three minutes to understand the primary key, foreign key, non empty, unique and default constraints in mysql, and how to create a table
第四天作業
水调歌头·明月几时有
Talk about the redis cache penetration scenario and the corresponding solutions
go web
Editing skills
第一天实验
Ultrasonic sensor (ch101 & ch201) - I
稳超胜算,历9弥新 | 2022金仓创新产品发布会顺利召开
机器学习作业1
Relationship and difference between wav and PCM
逻辑运算符1(阁瑞钛伦特软件-九耶实训)
Ultrasonic sensor (chx01) learning notes Ⅲ - I2C reading and writing operation
Database daily question --- day 25: bank account summary II
Opencv:06 morphology
深度学习参数初始化(二)Kaiming初始化 含代码
Pytorch version: yolov4 integrating attention and mobilenet
动态内存规划
How to invest scientifically and rationally when the global financial crisis strikes? 2020-03-17
Differences between get requests and post requests and usage examples