【原】monocle3軌跡分析

生信探索 2023-04-24 發(fā)布于云南

展開全文

monocle3與PAGA有點(diǎn)類似,，在UMAP圖上顯示軌跡圖,，沒有了樹狀的結(jié)構(gòu)。

原理,、圖的理解,，可以參考Reference中的鏈接

安裝

ubuntu

sudo apt install libudunits2-dev libgdal-dev

speedglm包違反CRAN規(guī)定被刪除了,，所以要從rstudio下載

using(pak)
pkgs <- c('BiocGenerics', 'DelayedArray', 'DelayedMatrixStats',
  'limma', 'lme4', 'S4Vectors', 'SingleCellExperiment',
  'SummarizedExperiment', 'batchelor', 'HDF5Array',
  'terra', 'ggrastr',"hhoeflin/hdf5r","mojaveazure/loomR@develop",
  "url::https://packagemanager./cran/2023-03-31/src/contrib/speedglm_0.3-4.tar.gz")
pak::pkg_install(pkgs)

0. 加載R包,、定義函數(shù)

using(monocle3, tidyverse, magrittr, Matrix, Seurat, SeuratObject, sp)
ps <- function(filename, plot = FALSE, w = 9, h = 6) {
    if (is.object(plot)) {
        print(plot)
    }
    plot <- recordPlot()
    pdf(file = paste0(filename,".pdf"), onefile = T, width = w, height = h)
    replayPlot(plot)
    dev.off()
}
n_jobs <- 8 # 使用的線程數(shù)

1.創(chuàng)建monocle對象

sobj：Seurat對象
cell_type：已經(jīng)注釋好了細(xì)胞類型
orig.ident：批次信息
sobj_embed：UMAP降維信息，是數(shù)據(jù)框,，行名是細(xì)胞,，有兩列分別對應(yīng)兩個維度
hvg：3000個高變基因用于后續(xù)降維
expression_matrix, a numeric matrix of expression values, where rows are genes, and columns are cells
cell_metadata, a data frame, where rows are cells, and columns are cell attributes (such as cell type, culture condition, day captured, etc.)
gene_metadata, an data frame, where rows are features (e.g. genes), and columns are gene attributes, such as biotype, gc content, etc. one of the columns of the gene_metadata should be named "gene_short_name", which represents the gene symbol or simple name (generally used for plotting) for each gene.

sobj <- readRDS("final.rds")
gene_metadata <- sobj@[email protected]
gene_metadata$gene_short_name <- rownames(gene_metadata)
cds <- new_cell_data_set(as(sobj@assays$RNA@counts,"sparseMatrix"),
    cell_metadata = [email protected],
    gene_metadata = gene_metadata
)
hvg <- seob@[email protected]

sobj_embed <- Seurat::Embeddings(seob, reduction = "umap")

rm(sobj, gene_metadata)

2.pre-process

標(biāo)準(zhǔn)化、預(yù)處理,、取批次

## Normalize and pre-process the data
cds %<>% preprocess_cds(num_dim = 30,method="PCA",norm_method="log",use_genes=hvg)
## Remove batch effects with cell alignment
cds %<>% align_cds(alignment_group = "orig.ident",preprocess_method="PCA")

3.Reduce dimensions and Cluster cells

降維,、聚類、分群,、分partition

這里使用UMAP作為降維算法,，再使用軌跡分區(qū)算法，把所有細(xì)胞分為兩個partitio,，不同分區(qū)的細(xì)胞會進(jìn)行單獨(dú)的軌跡分析,。

plot_pc_variance_explained(cds)
ps("PCA.pdf", w = 9, h = 6)

## Reduce the dimensions using UMAP
cds %<>% reduce_dimension(umap.fast_sgd = TRUE, cores = n_jobs, reduction_method = "UMAP", preprocess_method = "PCA")

## 從seurat導(dǎo)入整合過的umap坐標(biāo)
cds_embed <- cds@int_colData$reducedDims$UMAP
cds@int_colData$reducedDims$UMAP <- sobj_embed[rownames(cds_embed),]
plot_cells(cds, reduction_method = "UMAP", color_cells_by = "cell_type",group_label_size=6)
ps("UMAP_CellType.pdf")

## Cluster the cells
cds %<>% rcluster_cells(reduction_method = "UMAP", cluster_method = "leiden",random_seed=1314)
plot_cells(cds, color_cells_by = "partition",group_label_size=6)
ps("UMAP_partition.pdf")

4.Order cells in pseudotime along a trajectory

手動選擇root需要根據(jù)自己的生物學(xué)背景知識

## Learn a graph
cds %<>% learn_graph()
plot_cells(cds,trajectory_graph_segment_size = 1.5,
    group_label_size = 6,group_cells_by="cluster",
    color_cells_by = "cell_type", label_groups_by_cluster = FALSE,
    label_leaves = FALSE, label_branch_points = FALSE
)
ps("UMAP_graph.pdf")
cds <- order_cells(cds)

plot_cells(cds,
    group_label_size = 6,
    color_cells_by = "pseudotime",
    label_cell_groups = FALSE,
    label_leaves = FALSE,
    label_branch_points = FALSE,
    graph_label_size = 1.5
)
ps("Trajectory_Pseudotime.pdf")

The black lines show the structure of the graph.

The black lines show the structure of the graph. Note that the graph is not fully connected: cells in different partitions are in distinct components of the graph. The circles with numbers in them denote special points within the graph. Each leaf, denoted by light gray circles, corresponds to a different outcome (i.e. cell fate) of the trajectory. Black circles indicate branch nodes, in which cells can travel to one of several outcomes.

5.Perform differential expression analysis

這是空間差異分析常用的方法，在空間轉(zhuǎn)錄組中也有,。

morans_Icolumn, which ranges from -1 to +1. A value of 0 indicates no effect, while +1 indicates perfect positive autocorrelation and suggests that nearby cells have very similar values of a gene's expression. Significant values much less than zero are generally rare.

dea_res <- graph_test(cds, neighbor_graph = "principal_graph", cores = n_jobs) %>%
    dplyr::filter(q_value < 0.05)
fwrite(dea_res, "Trajectory_genes.csv")

6.Finding genes that change as a function of pseudotime

尋找擬時(shí)軌跡差異基因,，使用graph_test分析最重要的結(jié)果是莫蘭指數(shù)（morans_I），其值在-1至1之間,，0代表此基因沒有,。空間共表達(dá)效應(yīng),，1代表此基因在空間距離相近的細(xì)胞中表達(dá)值高度相似,。根據(jù)莫蘭指數(shù)挑選前10個基因用于可視化,。

degs <- dea_res$gene_short_name

top_genes <- dea_res %>%
    dplyr::top_n(n = 10, morans_I) %>%
    dplyr::pull(gene_short_name) %>%
    as.character()
plot_genes_in_pseudotime(cds[top_genes, ],
    color_cells_by = "cell_type",
    min_expr = 0.5, ncol = 2
)
ps("Top_Genes_Jitterplot.pdf", w = 9, h = 6)

p <- plot_cells(cds,
    genes = top_genes, show_trajectory_graph = FALSE,
    label_cell_groups = FALSE, label_leaves = FALSE
)
p$facet$params$ncol <- 5
ps("Top_Genes_Featureplot.pdf", plot = p)

7.Finding modules of co-regulated genes

尋找共表達(dá)基因模塊,根據(jù)上邊的差異分析結(jié)果,，按照UMAP和Louvain 聚類,，將這些基因分在不同的模塊中，有些模塊在某些細(xì)胞中特異高表達(dá),。

Once we have a set of genes that vary in some interesting way across the clusters, Monocle provides a means of grouping them into modules. We can call find_gene_modules(), which essentially runs UMAP on the genes (as opposed to the cells) and then groups them into modules using Louvain community analysis

gene_module_df <- find_gene_modules(cds[degs, ], resolution = 1e-2, cores = n_jobs)
fwrite(gene_module_df, "Genes_Module.csv")

cell_group_df <- tibble::tibble(
    cell = row.names(colData(cds)),
    cell_group = colData(cds)$cell_type
)
agg_mat <- aggregate_gene_expression(cds, gene_module_df, cell_group_df)
row.names(agg_mat) <- stringr::str_c("Module ", row.names(agg_mat))

pheatmap::pheatmap(agg_mat,
    cluster_rows = TRUE, cluster_cols = TRUE,
    scale = "column", clustering_method = "ward.D2",
    fontsize = 6
)
ps("heatmap.pdf", w = 5, h = 16)
plot_cells(cds,
    genes = gene_module_df %>% dplyr::filter(module %in% c(200, 169, 138, 22,17,91)),
    group_cells_by = "partition",
    color_cells_by = "partition",
    show_trajectory_graph = FALSE
)
ps("module.pdf")
saveRDS(cds,file="cds.rds")

Reference

https://zhuanlan.zhihu.com/p/451727080
https://cole-trapnell-lab./monocle3/docs
https://www.jianshu.com/p/c402b6588e17

轉(zhuǎn)藏 分享

QQ空間 QQ好友新浪微博微信

獻(xiàn)花（0） +1

來自：生信探索 > 《待分類》

舉報(bào)/認(rèn)領(lǐng)

0條評論

發(fā)表

請遵守用戶評論公約

類似文章 更多

生信探索

關(guān)注對話

TA的最新館藏

頂刊繪圖系列｜GSEA富集
頂刊繪圖系列｜富集條形圖
100個GEO基因表達(dá)芯片或轉(zhuǎn)錄組數(shù)據(jù)處理021.GSE5281
哈佛大學(xué)ChIP_ATAC_CUT&TAG_CUT&RUN教程08_ peak處理
哈佛大學(xué)ChIP/ATAC/CUT&TAG/CUT&RUN教程07: 使用MACS去call peak
pymol可視化｜如何選擇多個氨基酸殘基（HDOCK可視化）

喜歡該文的人也喜歡更多

熱門閱讀換一換

久久国产成人av_抖音国产毛片_a片网站免费观看_A片无码播放手机在线观看,色五月在线观看,亚洲精品m在线观看,女人自慰的免费网址,悠悠在线观看精品视频,一级日本片免费的,亚洲精品久,国产精品成人久久久久久久