The papers below all come from Aviv Regev's lab, and each of them uses single-cell transcriptome data to infer CNV.

The 2014 Science paper on GBM

The first is the 2014 Science paper on GBM (PMID: 24925914), which introduced this analysis and also used the CCLE database to check its reliability. The paper's own single-cell transcriptome libraries were built with the SMART-seq method, and the data are available under GSE57872. This library preparation method is somewhat dated by now:

SMART-seq protocol was implemented to generate single cell full length transcriptomes (modified from Shalek, et al Nature 2013) and sequenced using 25 bp paired end reads. Single cell cDNA libraries for MGH30 were resequenced using 100 bp paired end reads to allow for isoform and splice junction reconstruction (96 samples, annotated MGH30L).
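For that reason the authors filtered the cells quite strictly. You can download their processed expression matrix directly, or download the raw sequencing data and run a transcriptome pipeline yourself. The formula, proposed here for the first time, is as follows: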
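As a rough sketch of this kind of calculation, assuming the moving average of 100 genes over the genomically ordered gene list that the paper's GTEx passage describes (quoted further down), the smoothing could look like the Python below. The function names `center_genes` and `moving_average_cnv`, the toy matrix, and the choice to run one window across the whole ordered list (rather than chromosome by chromosome) are illustrative assumptions, not the authors' code.

```python
import numpy as np

def center_genes(log_expr):
    """Center each gene (row) at zero across all cells (columns)."""
    return log_expr - log_expr.mean(axis=1, keepdims=True)

def moving_average_cnv(rel_expr, window=100):
    """Average relative expression over sliding windows of `window` genes.

    rel_expr: genes x cells array, rows already sorted by genomic position.
    Returns an array with (n_genes - window + 1) rows, one per window.
    """
    n_genes, n_cells = rel_expr.shape
    if n_genes < window:
        raise ValueError("need at least `window` genes")
    # Cumulative sums let every window mean be computed with one subtraction.
    csum = np.cumsum(np.vstack([np.zeros((1, n_cells)), rel_expr]), axis=0)
    return (csum[window:] - csum[:-window]) / window

# Toy data (hypothetical): 300 ordered genes, 5 cells, a simulated gain in cell 0.
rng = np.random.default_rng(0)
log_expr = rng.normal(size=(300, 5))
log_expr[100:200, 0] += 1.0
rel = center_genes(log_expr)
cnv = moving_average_cnv(rel, window=100)
print(cnv.shape)
```

In the toy run, the window that starts at gene 100 lies entirely inside the simulated gain, so cell 0 gets a higher value there than the other cells.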
The 2016 Science paper on melanoma

Next is the 2016 Science paper on melanoma (PMID: 27124452), which also inferred CNV from single-cell transcriptome data. Its data are available under GSE72056. This time the libraries were built with Smart-seq2, for 4645 cells in total, and the expression matrix alone is 71 Mb. The raw sequencing data are held in dbGaP and can only be downloaded after an access request.

we applied single-cell RNA sequencing (RNA-seq) to 4645 single cells isolated from 19 patients, profiling malignant, immune, stromal, and endothelial cells.
Note that the authors also ran bulk transcriptome sequencing on six before/after pairs around treatment with RAF or RAF+MEK inhibitors, 12 samples in total, available under GSE77940. By this point the formula had changed slightly, as follows:
The 2017 Cell paper on head and neck cancer

Then comes the Cell paper on head and neck cancer, published in 2017: Single-Cell Transcriptomic Analysis of Primary and Metastatic Tumor Ecosystems in Head and Neck Cancer. The sequencing is described as follows:

We profiled transcriptomes of ~6,000 single cells from 18 head and neck squamous cell carcinoma (HNSCC) patients, including five matched pairs of primary tumors and lymph node metastases.
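The authors also generated whole-exome sequencing (WES) and targeted genotyping (SNaPshot) data for these patients, but those data are deposited under phs001474.v1.p1 and are not easy to download. The single-cell libraries were built with Smart-seq2, and all of the single-cell data are available under GSE103322. The expression matrix alone is close to 100 Mb: the file GSE103322_HNSCC_all_data.txt.gz is 86.0 Mb and can be downloaded from the GEO record over FTP or HTTP.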
Validation with CCLE data

For the CCLE validation, the 2014 Science paper on GBM (PMID: 24925914) states:

We downloaded the CCLE gene-centric RMA-normalized Affymetrix data (http://www./ccle/), and centered the expression of each gene across all cell lines at zero.
Downloading the CCLE data requires a simple registration: https://portals./ccle/users/sign_in. In principle, the result should be a figure showing that the CNV profile inferred from transcriptome data differs little from the SNP6.0 array result.

Validation with the GTEx database

To compare these patterns to an external reference of normal cells we downloaded RNA-Seq data from the GTEX portal (http://www./; gene read counts file from Jan. 2013), and estimated CNV values as above: we normalized the read counts into log2(TPM+1), averaged all brain samples, restricted the data to the ~6,000 analyzed genes, subtracted for each gene the average normalized expression from the GBM single-cell data (this step is comparable to the centering of the single cell data) and then used a moving average of 100 genes over the genomically-ordered list of genes to define CNV-cont.
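The steps in the GTEx passage can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names, the `gene_lengths_kb` input (the quote does not say how TPM was derived from read counts), the boolean `keep` mask standing in for the ~6,000 analyzed genes, and the toy numbers are all hypothetical.

```python
import numpy as np

def counts_to_log_tpm(counts, gene_lengths_kb):
    """Convert a genes x samples count matrix to log2(TPM + 1).

    gene_lengths_kb is an assumed input: per-gene length in kilobases.
    """
    rate = counts / gene_lengths_kb[:, None]
    tpm = rate / rate.sum(axis=0, keepdims=True) * 1e6
    return np.log2(tpm + 1)

def reference_cnv(ref_counts, gene_lengths_kb, sc_gene_mean, keep, window=100):
    """Reference CNV profile following the quoted steps.

    ref_counts: genes x samples counts, rows in genomic order.
    sc_gene_mean: average normalized single-cell expression of the kept genes.
    keep: boolean mask selecting the analyzed genes.
    """
    log_tpm = counts_to_log_tpm(ref_counts, gene_lengths_kb)
    ref_mean = log_tpm.mean(axis=1)[keep]   # average samples, restrict genes
    rel = ref_mean - sc_gene_mean           # subtract single-cell average
    kernel = np.ones(window) / window
    return np.convolve(rel, kernel, mode="valid")  # moving average

# Toy inputs (hypothetical): 500 genes, 4 reference samples, 400 genes kept.
rng = np.random.default_rng(1)
counts = rng.integers(1, 1000, size=(500, 4)).astype(float)
lengths = rng.uniform(0.5, 5.0, size=500)
keep = np.arange(500) < 400
# Stand-in for the single-cell means: reuse the reference means themselves.
sc_mean = counts_to_log_tpm(counts, lengths).mean(axis=1)[keep]
cnv_ref = reference_cnv(counts, lengths, sc_mean, keep, window=100)
print(cnv_ref.shape)
```

In this toy run the single-cell means are set equal to the reference means, so the resulting profile is flat at zero; with real data the two would come from different datasets.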
Summary

All of the papers above provide downloadable expression matrices, so the formulas given in their supplementary materials are enough to reproduce the whole workflow.