生存分析是生物信息醫(yī)學(xué)領(lǐng)域中最常見的一類分析方法,。其應(yīng)用主要包括幾個方面: 二是研究biomarker在癌癥中的預(yù)后效能,;常見如對于同一種癌癥類型使用放療的患者與使用化療的患者之間的生存是否存在顯著差異,,從而判斷使用哪種治療方法更有利于患者的生存,。 前面小編給大家介紹過 ? 生存曲線怎么看,怎么畫 ? 零代碼生存曲線—ENCORI篇 ? 零代碼生存曲線—GEPIA篇 ? 零代碼發(fā)文神器—UALCAN 今天我們來看看怎么利用R來繪制生存曲線,。 生存分析不同于其它多因素分析的主要區(qū)別點就是生存分析考慮了每個觀測出現(xiàn)某一結(jié)局的時間長短,。因此要順利的畫出生存曲線,首先需要理解兩個概念,。其一是生存時間,,其二是終點事件。弄清楚了這兩個概念,。我們就開始為畫圖做準(zhǔn)備工作啦,。首先從TCGA下載臨床數(shù)據(jù),。從TCGA下載數(shù)據(jù)有很多方法和教程這里就不多加贅述啦,。教程雖然多,但是拿到數(shù)據(jù)如何處理為生存分析時需要的數(shù)據(jù)格式呢,?上面我們說過生存資料的兩個變量:結(jié)局事件和生存時間,,要想畫出生存曲線,至少需要包含這兩列數(shù)據(jù),。下面以腎透明細胞癌KIRC數(shù)據(jù)為例進行代碼實戰(zhàn),。TCGA-KIRC.GDC_phenotype.tsv文件從https://gdc./download/TCGA-KIRC.GDC_phenotype.tsv.gz獲取kirc.phenotype <- read.csv("TCGA-KIRC.GDC_phenotype.tsv", header = T, sep = "\t") # 提取腫瘤的表型信息 tumor.sample <- sapply(as.character(kirc.phenotype$submitter_id.samples), function(x){ number <- as.numeric(unlist(strsplit(unlist(strsplit(as.character(x), split="-"))[4], split = "[A-Z]"))[1]) if(number<=9){ id <- as.character(x) return(id) } }) tumor.sample.id <- unlist(tumor.sample) tumor.kirc.phenotype <- kirc.phenotype[match(tumor.sample.id, kirc.phenotype$submitter_id.samples), ] rownames(tumor.kirc.phenotype) <- tumor.kirc.phenotype$submitter_id.samples # 獲取唯一的腫瘤樣本以及表型矩陣 uniq.sample.id <- unique(tumor.kirc.phenotype$submitter_id) uniq.tumor.kirc.phenotype <- tumor.kirc.phenotype[match(uniq.sample.id, tumor.kirc.phenotype$submitter_id), ] rownames(uniq.tumor.kirc.phenotype) <- uniq.tumor.kirc.phenotype$submitter_id # 獲取OS time os.time <- rep(0, nrow(uniq.tumor.kirc.phenotype)) dead.index <- which(uniq.tumor.kirc.phenotype$days_to_death > 0) alive.index <- which(uniq.tumor.kirc.phenotype$days_to_last_follow_up.diagnoses > 0) os.time[dead.index] <- uniq.tumor.kirc.phenotype$days_to_death[dead.index] os.time[alive.index] <- uniq.tumor.kirc.phenotype$days_to_last_follow_up.diagnoses[alive.index] # 獲取終點事件,并用0,,1表示 os.index <- which(os.time > 0) os.time <- os.time[os.index] status <- rep(0, nrow(uniq.tumor.kirc.phenotype)) dead.index <- which(uniq.tumor.kirc.phenotype$vital_status.demographic == "Dead") alive.index <- which(uniq.tumor.kirc.phenotype$vital_status.demographic == "Alive") status[dead.index] <- 0 status[alive.index] <- 1 status <- status[os.index] uniq.tumor.kirc.phenotype <- uniq.tumor.kirc.phenotype[os.index, ] # 提取感興趣的表型信息 interesting.tumor.kirc.data <- data.frame(gender = uniq.tumor.kirc.phenotype$gender.demographic, age = uniq.tumor.kirc.phenotype$age_at_initial_pathologic_diagnosis, status = status, OS = os.time) rownames(interesting.tumor.kirc.data) <- rownames(uniq.tumor.kirc.phenotype)
(向左滑動查看更多) 得到OS數(shù)據(jù)后,,接下來就進入到畫圖階段啦。讓我們跟著下面的代碼,,一步一步的畫出KM曲線,。 # step1 加載R包 library(survival) library(survminer) # step2 使用Surv()函數(shù)創(chuàng)建生存數(shù)據(jù)對象(生存時間、終點事件) # step3 再用survfit()函數(shù)對生存數(shù)據(jù)對象擬合生存函數(shù),,創(chuàng)建KM(Kaplan-Meier)生存曲線 plot.interesting.tumor.kirc.data <- survfit(Surv(interesting.tumor.kirc.data$OS,event=interesting.tumor.kirc.data$status)~gender, data=interesting.tumor.kirc.data) # step4 使用ggsurvplot函數(shù)進行畫圖 ggsurvplot(plot.interesting.tumor.kirc.data , pval = T, legend.title = "sex", legend="right", #font.main = c(16, "bold", "darkblue"), #censor.shape=26, palette = c("blue","red"), censor=F, xlab="overall survival", title="survival analysis")
(向左滑動查看更多)
下面我們基于M分期來畫生存曲線,。如果對腫瘤TNM分期還不了解的小伙伴可以參考腫瘤TNM分期。 #pathologic_M的生存曲線,,三個分期interesting.tumor.kirc.data <- data.frame(pathologic_M = uniq.tumor.kirc.phenotype$pathologic_M, age = uniq.tumor.kirc.phenotype$age_at_initial_pathologic_diagnosis, status = status, OS = os.time)rownames(interesting.tumor.kirc.data) <- rownames(uniq.tumor.kirc.phenotype)index=pathologic_M %in% c("M0","M1","MX")interesting.tumor.kirc.data=interesting.tumor.kirc.data[index,]plot.interesting.tumor.kirc.data <- survfit(Surv(interesting.tumor.kirc.data$OS,event=interesting.tumor.kirc.data$status)~pathologic_M, data=interesting.tumor.kirc.data)# step4 使用ggsurvplot函數(shù)進行畫圖ggsurvplot(plot.interesting.tumor.kirc.data , pval = T, conf.int = TRUE, legend.title = "pathologic_M", legend="right", surv.median.line = "hv", palette = c("blue","red","green"), censor=F, risk.table=T, xlab="overall survival", title="survival analysis") 以上就是快速上手畫出生存曲線圖的全部內(nèi)容了,,你會了嗎?感興趣的小伙伴們也可以動手試一試,。
|