Literature Reading 1
Title: Experimental validation of methods for differential gene expression analysis and sample pooling in RNA-seq
Source: BMC Genomics, 2015
Abstract (translated):
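Background: Massively parallel cDNA sequencing (RNA-seq) experiments are gradually replacing microarrays for quantitative gene expression profiling. However, many biologists are unsure which differentially expressed gene (DEG) analysis method to choose, and whether cost-saving sample pooling strategies are reliable in RNA-seq experiments. We therefore validated the DEGs identified by Cuffdiff2, edgeR, DESeq2 and the Two-stage Poisson Model (TSPM) in an RNA-seq experiment on mouse amygdala micro-punches, using high-throughput qPCR on independent biological replicate samples. In addition, we sequenced pooled RNA samples and compared the results with those from the corresponding individually sequenced samples.
Results: The false-positive rate of Cuffdiff2 and the false-negative rates of DESeq2 and TSPM were high. Among the four DEG analysis methods investigated, edgeR had relatively high sensitivity and specificity. We documented the pooling bias, and the DEGs identified in pooled samples had low positive predictive values.
Conclusions: Our results show the need to combine more sensitive DEG analysis methods with high-throughput validation of the identified DEGs in future RNA-seq experiments. They also indicate that sample pooling has limited utility for RNA-seq experiments in similar setups, and they support increasing the number of biological replicates.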
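Findings consistent with previous studies:
(1) DESeq has low sensitivity.
(2) Cuffdiff has a high false-positive rate.
(3) edgeR has high sensitivity.
(4) The false-positive and false-negative rates of TSPM depend on the number of replicates.
For these reasons edgeR and DESeq2 are now the generally recommended tools, and Cuffdiff is no longer recommended.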
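The terms used above (sensitivity, false-positive rate, positive predictive value) can be made concrete with a small sketch. The function name and the counts below are made up for illustration and are not taken from the paper; they only show how each metric follows from a validation table of true/false positives and negatives.

```python
def validation_metrics(tp, fp, tn, fn):
    """Compute common validation metrics from confusion-matrix counts.

    tp/fp: DEG calls confirmed / not confirmed by qPCR
    tn/fn: non-DEG calls that agree / disagree with qPCR
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of real DEGs that were called
        "specificity": tn / (tn + fp),   # share of non-DEGs correctly left uncalled
        "ppv": tp / (tp + fp),           # share of calls that are real DEGs
    }


# Hypothetical counts, for illustration only.
metrics = validation_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

A method with many false positives (as reported for Cuffdiff2) shows up as a low PPV, while many false negatives (as reported for DESeq2 and TSPM) show up as low sensitivity.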
Differences among the DEG analysis methods:

| Aspect | edgeR | DESeq2 | Cuffdiff2 | TSPM |
| --- | --- | --- | --- | --- |
| Normalisation | A model that incorporates normalisation factors as offsets, estimated by the trimmed mean of M values for each contig | A relative log expression method | Considers total number of reads, gene length, variability within and between the conditions, and differential isoform expression | Accommodates various normalisation procedures, but works without normalisation by default |
| Distribution | Negative binomial distribution | Negative binomial distribution | Beta negative binomial distribution | Poisson distribution |
| Dispersion estimation | Moderates its dispersion estimates by their dispersion-mean relationship | Stringent in detecting outliers and excludes genes with extreme read counts by default; uses the maximum a posteriori dispersion estimates | Includes covariances between different isoforms | Estimates dispersion per gene, without considering information across genes |
| p-value calculation | Generalized linear model (GLM) likelihood ratio test | Generalized linear model (GLM) likelihood ratio test | Generalized linear model (GLM) likelihood ratio test | Quasi or standard likelihood ratio tests, depending on whether a gene is over-dispersed |
The main difference among these methods lies in how they estimate dispersion.
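To see why dispersion matters, it helps to compare the variance implied by a Poisson model with that of a negative binomial model. The sketch below assumes the usual parameterisation in which the negative binomial variance is mu + dispersion * mu^2; the function names and the numbers are illustrative only.

```python
def poisson_variance(mu):
    """Under a Poisson model the variance equals the mean."""
    return mu


def nb_variance(mu, dispersion):
    """Negative binomial variance, assuming var = mu + dispersion * mu**2."""
    return mu + dispersion * mu ** 2


# Illustrative values: a gene with mean count 100 and dispersion 0.1.
mu = 100
print(poisson_variance(mu))          # variance under Poisson
print(nb_variance(mu, 0.1))          # much larger variance under NB
```

The larger the estimated dispersion, the more between-replicate variation a gene is allowed before a difference counts as significant, which is why the estimation step drives how many DEGs each method reports.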
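Panels A and C both show that pooled samples yield markedly more DEGs than the corresponding individual samples. Pooling amounts to averaging, which discards outlier information and the magnitude of within-group differences. Sequencing pooled libraries therefore underestimates within-group variation, so many DEGs with low positive predictive value are identified.
Comparing panels A and C, the pools of 8 samples identified fewer DEGs than the pools of 3 samples (counts quoted: 18055 and 15745), and the same holds for the individual samples (82 vs 16), so adding biological replicates reduces the pooling bias in predicting differentially expressed genes.
Comparing panels B and D likewise indicates that adding biological replicates improves the ability to estimate population variation and lowers both the pooling bias and the false-positive rate.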
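The claim that pooling behaves like averaging and hides within-group variation can be checked with a toy simulation. This is a sketch under simple assumptions (normally distributed expression values, equal RNA contribution from each individual, made-up mean and standard deviation), not a model of the paper's data.

```python
import random
import statistics


def simulate_pooling(n_individuals=240, pool_size=8, seed=0):
    """Compare variance across individuals with variance across pools.

    Each pool is modelled as the mean of `pool_size` individuals,
    i.e. every individual contributes the same amount of RNA.
    """
    rng = random.Random(seed)
    # Hypothetical expression values: mean 100, standard deviation 20.
    individuals = [rng.gauss(100.0, 20.0) for _ in range(n_individuals)]
    pools = [
        statistics.mean(individuals[i:i + pool_size])
        for i in range(0, n_individuals, pool_size)
    ]
    return statistics.variance(individuals), statistics.variance(pools)


var_individuals, var_pools = simulate_pooling()
print(var_individuals, var_pools)
```

The variance across pools comes out far smaller than the variance across individuals, so a test that only sees pooled libraries works with an underestimate of biological variability and calls more genes significant.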
RNA-seq analysis
RNA quality assessment:
NanoDrop 1000: 264 ng RNA/sample
Agilent 2100 Bioanalyzer: RNA Integrity Number (RIN) 7.53 (SD 0.31)
Quality indicators for total RNA and small RNA:
Proportion of miRNA (10-40 nt) within small RNA (10-150 nt)
RIN >= 7: good. RIN between 6 and 7: can sometimes still give good results, and is worth trying if the samples are extremely precious.
28S/18S > 0.7
Fluorescence units > 1
Reference link: http://www.docin.com/p-769106334.html
DEG analysis workflow
Quality control > Alignment to the mouse genome (TopHat 2.0.6) > Aligned read counting (HTSeq 0.54) > DEG analysis (edgeR 3.2.4, Cuffdiff 2.1.1, DESeq2 1.0.19 and TSPM)
Genes with adjusted p-values below 0.05 (Benjamini-Hochberg false discovery rate correction) were considered DEGs.
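The Benjamini-Hochberg adjustment used for the 0.05 cutoff can be sketched in a few lines. This is a plain-Python illustration of the standard procedure, not the code from any of the tools above, and the p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = benjamini_hochberg(raw_p)
is_deg = [q < 0.05 for q in adj_p]
print(adj_p)
print(is_deg)
```

Note that the third and fourth genes have raw p-values below 0.05 but are no longer called DEGs after adjustment, which is the point of the correction.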
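Criteria for validating DEGs by qPCR
A DEG identified in the RNA-seq analysis is treated as a true-positive DEG if it meets the following criteria:
1. RNA-seq and qPCR show the same direction of differential expression (up-regulated or down-regulated).
2. The fold change estimated by qPCR is either above 1.25 or below 0.8 (log2 fold change, LFC, boundary of ±0.3219).
Spearman correlation coefficients, root-mean-square deviation and kappa statistics were computed with STATA 13.1.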
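The two validation criteria can be written as a small check. The function name, argument names and example values are hypothetical; the thresholds (fold change above 1.25 or below 0.8, equivalent to |log2 fold change| above about 0.3219) come from the criteria above.

```python
import math

UP_CUTOFF = 1.25    # qPCR fold change above this counts as up-regulated
DOWN_CUTOFF = 0.8   # qPCR fold change below this counts as down-regulated


def is_true_positive(rnaseq_lfc, qpcr_fold_change):
    """Check whether qPCR confirms an RNA-seq DEG call.

    rnaseq_lfc: log2 fold change from RNA-seq (sign gives the direction)
    qpcr_fold_change: linear fold change from qPCR (1.0 means no change)
    """
    confirmed_up = rnaseq_lfc > 0 and qpcr_fold_change > UP_CUTOFF
    confirmed_down = rnaseq_lfc < 0 and qpcr_fold_change < DOWN_CUTOFF
    return confirmed_up or confirmed_down


# The LFC boundary quoted in the notes: log2(1.25) and log2(0.8).
print(round(math.log2(UP_CUTOFF), 4), round(math.log2(DOWN_CUTOFF), 4))

# Hypothetical genes, for illustration only.
print(is_true_positive(1.2, 1.5))    # same direction, large enough change
print(is_true_positive(1.2, 1.1))    # same direction, change too small
print(is_true_positive(1.0, 0.5))    # opposite directions
```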
Original article:
https://www.ncbi.nlm./pmc/articles/PMC4515013/