MCPcounter in practice: notes and tips

連果啦啦 2020-04-15


First

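The data I received are nRPM values, processed as shown below: the normalization takes housekeeping genes into account. Differential expression analysis is then done with limma.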

image.png

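My question: can I take the raw read counts and normalize them with DESeq2 instead? (DESeq2 normalization requires integer read counts, otherwise it throws an error.)
Answer: DESeq2 normalization does not take housekeeping genes into account, so this approach is not recommended here.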
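To make the idea of "normalization that takes housekeeping genes into account" concrete, here is a minimal sketch. It is not the procedure used for my nRPM data, only one common way of doing it: scale each sample by the geometric mean of a few housekeeping genes. The count matrix and the choice of GAPDH and ACTB as housekeeping genes are assumptions made up for the illustration.

```r
# Toy count matrix: genes in rows, samples in columns (made-up numbers).
counts <- matrix(
  c(100, 200,  50,
    400, 800, 200,
     30,  90,  10,
    250, 500, 125),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GAPDH", "ACTB", "GENE_A", "GENE_B"),
                  c("S1", "S2", "S3"))
)

# Hypothetical choice of housekeeping genes for this illustration.
housekeeping <- c("GAPDH", "ACTB")

# Per-sample size factor: geometric mean of the housekeeping genes.
size_factor <- exp(colMeans(log(counts[housekeeping, , drop = FALSE])))

# Rescale so the size factors have a geometric mean of 1 across samples.
size_factor <- size_factor / exp(mean(log(size_factor)))

# Divide every column (sample) by its size factor.
normalized <- sweep(counts, 2, size_factor, "/")
```

After this step the housekeeping genes have the same value in every sample. In general the normalized values are no longer integers, which is why they cannot be passed to DESeq2 as if they were raw counts.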

Second

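I drew a PCA plot directly from my nRPM data. The nRPM data format and the PCA plot are shown below:

Because the values vary widely, the PCA plot does not look very good, so the suggestion was to normalize once more with a z-score or a log ratio. Code reference: https://www.jianshu.com/p/57f62efa0fab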

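# z-score is simply scale(); note that scale() standardizes columns,
# so use t(scale(t(expr))) to z-score each gene when genes are in rows
b <- scale(expr)
# log transform with a pseudocount of 1
a <- log(expr + 1)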
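The two transformations above can be chained before the PCA. The following is a minimal sketch on simulated data, assuming genes are in rows and samples in columns; the matrix `expr` here is randomly generated and only stands in for the real nRPM table.

```r
set.seed(42)

# Simulated expression matrix: 50 genes (rows) x 6 samples (columns).
expr <- matrix(rexp(300, rate = 0.01), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("S", 1:6)))

# Log transform with a pseudocount to compress the dynamic range.
log_expr <- log(expr + 1)

# scale() works on columns, so transpose to z-score each gene across samples.
z_expr <- t(scale(t(log_expr)))

# PCA on samples: prcomp() expects observations (samples) in rows.
pca <- prcomp(t(z_expr), center = TRUE, scale. = FALSE)

# Coordinates of the samples on the first two principal components.
pc_scores <- pca$x[, 1:2]
```

`plot(pc_scores)` then gives the sample-level PCA plot; colouring the points by group is left out to keep the sketch short.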

Third

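I needed to find out whether MCPcounter requires read counts or can work with normalized values.
xCell, by contrast, states this clearly: either read counts or normalized values can be used, because it only works on rankings.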
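The point about ranking can be checked in a few lines. This is not xCell's algorithm, only an illustration of why a rank-based method is insensitive to the choice between counts and normalized values: any monotone transformation within a sample leaves the gene ranks unchanged. The gene names and counts are made up.

```r
# Made-up read counts for five genes in one sample.
counts <- c(geneA = 5, geneB = 120, geneC = 37, geneD = 0, geneE = 980)

# A typical normalization: counts per million followed by log2(x + 1).
normalized <- log2(counts / sum(counts) * 1e6 + 1)

# Rank the genes within the sample before and after normalization.
rank_counts <- rank(counts)
rank_norm   <- rank(normalized)

# The two rankings are the same.
same_ranking <- identical(rank_counts, rank_norm)
```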

The original MCPcounter methods paper used normalized results when validating on TCGA data, which answers my question.
Reference: Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression

image.png

The biggest difference between MCPcounter and CIBERSORT is that CIBERSORT estimates the proportions of lymphocytes, whereas MCPcounter is an absolute scoring method.
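As I understand the method, an MCPcounter score is essentially the average expression (on a log scale) of the marker genes of one cell population, computed per sample. The sketch below is simplified, uses a made-up expression matrix and illustrative marker lists, and is not the package's real signature or code.

```r
# Made-up log-scale expression matrix: genes in rows, samples in columns.
expr <- matrix(
  c(8, 9, 2,
    6, 7, 1,
    1, 2, 9,
    3, 2, 7,
    5, 5, 5),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("CD3D", "CD28", "CD19", "CD79A", "ACTB"),
                  c("S1", "S2", "S3"))
)

# Illustrative marker lists (hypothetical, not the real MCPcounter signature).
markers <- list(
  "T cells"   = c("CD3D", "CD28"),
  "B lineage" = c("CD19", "CD79A", "MS4A1")  # MS4A1 is absent from expr
)

# Score = mean expression of the markers that are present, per sample.
scores <- t(sapply(markers, function(genes) {
  present <- intersect(genes, rownames(expr))
  colMeans(expr[present, , drop = FALSE])
}))
```

`scores` has one row per cell population and one column per sample. The values are in arbitrary units, so they can be compared between samples for the same population but do not sum to 1 the way proportions would.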

Fourth

My own data show:

Primary tumor

image.png	image.png

Metastasis lesion

image.png	image.png

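Reference: Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression
This paper analysed lung adenocarcinoma data and, after dividing the patients into four groups, found that patients with high B lineage and high T cell expression had the best prognosis. Our data show the same finding, which lays the foundation for this paper.
Later I will also recode my grouping into four groups for the analysis.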
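A minimal sketch of how such a four-group variable could be built from two MCPcounter scores. The scores are made up, and splitting at the median is my assumption; the cut-off used in the paper may differ.

```r
# Made-up MCPcounter scores for eight patients.
scores <- data.frame(
  sample    = paste0("P", 1:8),
  B_lineage = c(1.2, 3.4, 0.8, 2.9, 3.8, 0.5, 2.1, 1.9),
  T_cells   = c(2.5, 4.1, 1.0, 1.2, 3.9, 3.6, 0.9, 2.8)
)

# Median split for each population (assumed cut-off).
b_high <- scores$B_lineage > median(scores$B_lineage)
t_high <- scores$T_cells   > median(scores$T_cells)

# Combine the two splits into one factor with four levels.
scores$group <- factor(
  paste0("B_", ifelse(b_high, "high", "low"),
         "/T_", ifelse(t_high, "high", "low")),
  levels = c("B_high/T_high", "B_high/T_low", "B_low/T_high", "B_low/T_low")
)

group_sizes <- table(scores$group)
```

The `group` column can then be used as the grouping variable in a survival analysis.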

image.png

Fifth

How to use MCPcounter for TCGA data mining

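All you need is an expression matrix: the colnames are the sample names and the rownames are gene symbols or one of two other ID types (I do not remember which). I use featuresType=c('HUGO_symbols'); there are actually three options to choose from.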

The code is as follows:

library(curl)
library(MCPcounter)
?MCPcounter.estimate

# Load the TCGA expression data (provides exprMatrix)
load(file = 'TCGA.Rdata')
exprMatrix[1:10, 1:10]

#### Map the cleaned Ensembl IDs to gene symbols ####
library(clusterProfiler)
library(org.Hs.eg.db)
ls('package:org.Hs.eg.db')
g2s = toTable(org.Hs.egSYMBOL); head(g2s)
g2e = toTable(org.Hs.egENSEMBL); head(g2e)
tmp = merge(g2e, g2s, by = 'gene_id')
head(tmp)

# Rename the Ensembl ID column (the last column) so it can be used in the merge below
colnames(exprMatrix)[ncol(exprMatrix)] <- c('ensembl_id')
exprMatrix <- merge(tmp, exprMatrix, by = 'ensembl_id')
# Drop ensembl_id and gene_id; symbol becomes the first column
exprMatrix <- exprMatrix[, -c(1, 2)]
# Keep the first row for each duplicated symbol
exprMatrix = exprMatrix[!duplicated(exprMatrix$symbol), ]
row.names(exprMatrix) <- exprMatrix[, 1]
exprMatrix <- exprMatrix[, -1]

# Read the signature files of the ebecht/MCPcounter repository
probesets = read.table(curl('http://raw./ebecht/MCPcounter/master/Signatures/probesets.txt'), sep = '\t', stringsAsFactors = FALSE, colClasses = 'character')
genes = read.table(curl('http://raw./ebecht/MCPcounter/master/Signatures/genes.txt'), sep = '\t', stringsAsFactors = FALSE, header = TRUE, colClasses = 'character', check.names = FALSE)

results <- MCPcounter.estimate(exprMatrix, featuresType = c('HUGO_symbols')[1],
                    probesets = probesets,
                    genes = genes)
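The ID conversion step is the part that most often goes wrong, so here is a small self-contained sketch of the same logic in base R. The expression table and the mapping table `id_map` are made up (in the real code the mapping comes from org.Hs.eg.db), and stripping the version suffix from the Ensembl IDs is an assumption about what "cleaned" IDs means.

```r
# Made-up expression table with versioned Ensembl IDs in the last column.
expr <- data.frame(
  S1 = c(10, 20, 30, 40),
  S2 = c(11, 21, 31, 41),
  ensembl_id = c("ENSG00000000001.5", "ENSG00000000002.3",
                 "ENSG00000000003.1", "ENSG00000000004.2"),
  stringsAsFactors = FALSE
)

# Hypothetical mapping table: two Ensembl IDs share the symbol GENE_B,
# and the fourth Ensembl ID has no symbol at all.
id_map <- data.frame(
  ensembl_id = c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003"),
  symbol     = c("GENE_A", "GENE_B", "GENE_B"),
  stringsAsFactors = FALSE
)

# Strip the version suffix (".5", ".3", ...) so the IDs match the mapping.
expr$ensembl_id <- sub("\\.[0-9]+$", "", expr$ensembl_id)

# Inner join: rows without a symbol are dropped.
merged <- merge(id_map, expr, by = "ensembl_id")

# Keep the first row for each duplicated symbol.
merged <- merged[!duplicated(merged$symbol), ]

# Numeric matrix with symbols as rownames and samples as colnames.
mat <- as.matrix(merged[, c("S1", "S2")])
rownames(mat) <- merged$symbol
```

Keeping the first row per symbol mirrors the code above. Keeping the row with the highest mean expression is another common choice.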

Sixth

Modifying the legend of the survival curve. Reference:
http://www./english/wiki/survminer-r-package-survival-data-analysis-and-visualization
To be continued...
