R language -- principles of the Cox model calibration curve (I): data sources

2022-07-19 12:47:00 【Eat two bites at a time】

Contents

Preface
I. What is a calibration curve?
II. Calibration curve
Summary

Preface

A quick web search will show you how to draw the calibration curve of a Cox model, and most tutorials explain the parameter settings clearly, so I will not repeat them here. The principle behind the calibration curve is another matter: what each point and error bar on the plot means, and how it is calculated, is something I suspect most people do not fully understand. I read the source code and am recording what I learned here, in the hope that it also helps others better understand their own data and models.

I. What is a calibration curve?

A calibration curve is a scatter plot of the observed event rate against the predicted event rate.
This article explains the data behind the points and lines in the plot. Let's first look at an ordinary calibration curve (drawn with default parameters).
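
To make the idea of "observed against predicted" concrete, here is a minimal base R sketch. It is a deliberate simplification and not what rms does: it uses a binary outcome without censoring, the predicted probabilities are simulated, and all variable names are made up for illustration.

```r
set.seed(1)
n     <- 3000
pred  <- runif(n)                          # hypothetical predicted probabilities
event <- rbinom(n, size = 1, prob = pred)  # outcomes simulated so the "model" is well calibrated

# Split subjects into 3 equally sized groups, ordered by predicted probability
grp <- cut(rank(pred, ties.method = "first"), breaks = 3, labels = FALSE)

# One point per group: mean predicted probability vs. observed event fraction
calib <- data.frame(
  mean_predicted = as.vector(tapply(pred, grp, mean)),
  observed       = as.vector(tapply(event, grp, mean))
)
print(calib)

plot(calib$mean_predicted, calib$observed, type = "b", pch = 16,
     xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Mean predicted probability", ylab = "Observed fraction")
abline(a = 0, b = 1, col = "grey")         # ideal line: observed equals predicted
```

Because the outcomes are simulated from the predicted probabilities themselves, the three points should fall close to the gray diagonal. For a Cox model the observed value is a Kaplan-Meier survival estimate per group rather than a simple fraction.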

II. Calibration curve

1. How to draw it?

(The publishing assistant complained that the post was too short, so I am padding it with the parameter list.)
The main plotting parameters:

x: the object returned by the calibrate function
xlab: x-axis title
ylab: y-axis title
subtitles = TRUE: whether to draw the subtitles, meaning the small-print notes at the bottom of the plot. The default is TRUE; you may need to set it to FALSE to remove them when preparing a figure for publication.
conf.int = TRUE: whether to draw the error bars. The default is TRUE (draw them).
cex.subtitles = 0.75: text size of the subtitles
riskdist = TRUE: whether to draw the rug plot along the top of the plot (the hair-like tick marks)
add = FALSE: whether to add to an existing plot (TRUE) or create a new one (FALSE)
scat1d.opts: options for the rug plot. You rarely need this; see the arguments of the scat1d function for details.
par.corrected: a list of graphical parameters for the corrected points. The default is NULL, in which case the function uses list(col = "blue", lty = 1, lwd = 1, pch = 4); any parameter you supply overrides the corresponding default. For example, if an editor has the unusual request to hide the small corrected x marks, you can set par.corrected = list(col = "white").
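
The override behaviour of par.corrected described above can be imitated in base R with modifyList. This is only an illustration of the effect, not the code rms actually uses, and the variable names are made up.

```r
# Defaults that the plot method falls back to, as listed above
defaults <- list(col = "blue", lty = 1, lwd = 1, pch = 4)

# What the user passes: only the colour is changed
user <- list(col = "white")

# Entries supplied by the user replace the defaults; the rest are kept
par_corrected <- modifyList(defaults, user)
str(par_corrected)
```

With a calibrate object named cal, the corresponding call would be plot(cal, subtitles = FALSE, par.corrected = list(col = "white")), which also drops the small-print notes.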

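library(rms)       # cph(), calibrate(), datadist()
library(survival)  # Surv()
# Column names are English renderings of the author's original dataset columns
dd <- datadist(data)
options(datadist = "dd")
# x, y and surv must be TRUE so that calibrate() can reuse the fit
f <- cph(Surv(DMFS_time, DMFS_status) ~ histological_subtype + T_subclass + S100,
         time.inc = 60, data = data, x = TRUE, y = TRUE, surv = TRUE)
# Bootstrap calibration at u = 60, with m subjects per group
cal <- calibrate(f, cmethod = "KM", method = "boot", u = 60, m = round(nrow(data) / 3), B = 1000)
cal
plot(cal, xlim = c(0, 1), ylim = c(0, 1))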

[Figure: default calibration plot produced by plot(cal)]

2. How to read it?

The calibration plot contains 6 points (3 solid points and 3 x-shaped points) and 3 error bars. Each of them can be traced back to the output of cal.
Solid points: the abscissa is the mean.predicted column of the calibrate result (hereinafter cal), and the ordinate is the KM column of cal.
x points: the ordinate of each x-shaped point is the KM.corrected column of cal; the abscissa is the same mean.predicted value.
Error bars: they are calculated from the std.err column of cal. Upper end: ifelse(KM == 0, 0, pmin(1, KM * exp(1.959964 * std.err))). Lower end: ifelse(KM == 0, 0, KM * exp(1.959964 * (-std.err))). The constant 1.959964 is the 97.5% quantile of the standard normal distribution, so the bars are 95% intervals.
Notes in the lower left and right corners: the small annotations below the plot have the following meanings:

n=72: the data contain 72 observations
d=29: the outcome event occurred 29 times (that is, 29 cases in the data have status = 1)
p=6: the Cox model has 6 coefficients (note: the number of coefficients is not the same as the number of modeled variables, because a categorical variable has several coefficients)
24 subjects per group: when the data are grouped (I will introduce this later, with the calculations), each group contains 24 observations
Gray: ideal: the gray line is the ideal line
B=1000: 1000 bootstrap resamples were performed
Based on observed-predicted: indicates how the index.orig column of cal is calculated; it has no effect on the plot (a placeholder for now; I will come back to it later).
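
The error-bar formulas quoted above can be tried out on made-up numbers before applying them to a real cal object. The KM and std.err values below are hypothetical and chosen to exercise the two special cases (KM equal to 0, and an upper limit that would exceed 1).

```r
z <- qnorm(0.975)                     # 1.959964, the multiplier in the formulas

KM      <- c(0.90, 0.60, 0.00, 0.99)  # hypothetical Kaplan-Meier estimates
std.err <- c(0.05, 0.10, 0.20, 0.05)  # hypothetical standard errors

# Lower and upper ends of the error bars, exactly as in the formulas above
lower <- ifelse(KM == 0, 0, KM * exp(-z * std.err))
upper <- ifelse(KM == 0, 0, pmin(1, KM * exp(z * std.err)))

round(cbind(KM, std.err, lower, upper), 3)
```

Since KM * exp(±z * std.err) equals exp(log(KM) ± z * std.err), the interval is symmetric on the log scale, which suggests that std.err is a standard error for log survival rather than for survival itself. A KM of 0 gets a bar of zero length, and the upper end is capped at 1.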

3. Verification

Now run the following code, built from the description above, to see whether it reproduces the same plot.

library(Hmisc)  # errbar() and scat1d(); already attached if rms is loaded
## Prepare the error bar data
errl <- ifelse(cal[, "KM"] == 0, 0, cal[, "KM"] * exp(1.959964 * (-cal[, "std.err"])))
errh <- ifelse(cal[, "KM"] == 0, 0, pmin(1, cal[, "KM"] * exp(1.959964 * cal[, "std.err"])))
## Draw the solid points with their error bars
## (x is the predicted survival, y is the observed Kaplan-Meier estimate)
errbar(x = cal[, "mean.predicted"], y = cal[, "KM"], yminus = errl, yplus = errh,
       pch = 16, cex = 1.2, xlim = c(0, 1), ylim = c(0, 1), asp = 1, xaxs = 'i',
       yaxs = 'i',
       xlab = 'Predicted 60 Day Survival',
       ylab = 'Fraction Surviving 60 Day')
## Add the gray reference diagonal
abline(a = 0, b = 1, col = 'grey')
## Connect the solid points with a line
lines(x = cal[, "mean.predicted"], y = cal[, "KM"])
## Add the corrected points
points(x = cal[, "mean.predicted"], y = cal[, "KM.corrected"], pch = 4)
## Draw the rug plot
scat1d(x = attr(cal, "predicted"), nhistSpike = 200)

[Figure: calibration plot reproduced with the manual code above]

It is not merely similar; it is identical. (To show that I really did not just tweak the original plotting parameters: in the figure above, the gray reference line is visibly drawn beneath the solid line.) Of course, because the original function handles the axis proportions differently, the appearance may differ slightly, but the meaning of the plot is exactly the same.

Summary

To understand the calibration curve better, I read the source code of the relevant functions; this post serves as my own learning notes.

This article is only the beginning. There is much more to cover, and I will keep updating.

I am a beginner, so if you find a mistake, please point it out and correct me.

Copyright notice
This article was written by [Eat two bites at a time]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/200/202207171653272107.html