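Recently, while plotting UMAPs, I noticed that cell-subpopulation labels sometimes overlap the points and clash with their colors. A colleague suggested making the labels "stand out", and the first thing that came to mind was geom_mark_ellipse from ggforce. In practice I ran into some problems (for example, ggforce is affected by outliers, so the plot looks messy), hence this single-cell note.

ggforce is affected by outliers

Trying ggforce annotation

```r
library(dplyr)
library(Seurat)
library(SeuratData)
library(patchwork)
library(ggforce)

## InstallData("pbmc3k")
data("pbmc3k")

## UMAP coordinates plus the cluster identity of every cell
points <- data.frame(pbmc3k.final@reductions$umap@cell.embeddings,
                     cluster = Idents(pbmc3k.final))

DimPlot(pbmc3k.final) +
  geom_mark_ellipse(data = points,
                    aes(x = UMAP_1, y = UMAP_2, label = cluster, col = cluster),
                    inherit.aes = FALSE) +
  NoLegend()
```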
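Version 1 is pretty ugly, isn't it? Some clusters (such as Naive CD4 T) contain outliers, and the ggforce function draws its ellipse around every point it is given. So the outliers should be dropped. There are many ways to do that; I reuse the confidence-ellipse approach I have used before.

The idea for the fix:

- Draw geom_mark_ellipse from points on the confidence ellipse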
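To see why replacing the raw cells with ellipse points helps, here is a minimal sketch on simulated data (not the pbmc3k embedding). The helper names `ellipse_points` and `max_dist` and the single outlier at (15, 15) are made up for illustration. It compares how far an outline must reach when it has to contain every point with how far an 80% covariance-based ellipse reaches.

```r
# Simulated stand-in for one cluster's 2-D coordinates, plus one far-away cell.
set.seed(1)
cluster_xy <- cbind(rnorm(500), rnorm(500))
with_outlier <- rbind(cluster_xy, c(15, 15))

# Points on the normal-theory ellipse of a 2-column matrix (hypothetical helper).
ellipse_points <- function(xy, prob = 0.8, n = 100) {
  theta <- seq(-pi, pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  radius <- sqrt(qchisq(prob, df = 2))
  sweep(circle %*% chol(var(xy)) * radius, 2, colMeans(xy), FUN = "+")
}

# Largest distance from the origin, the true centre of the simulated cluster.
max_dist <- function(xy) max(sqrt(rowSums(xy^2)))

# An outline that contains every point has to reach the outlier itself.
reach_all_points <- max_dist(with_outlier)
# The ellipse is stretched a little by the outlier but stays near the cluster.
reach_ellipse <- max_dist(ellipse_points(with_outlier))

c(all_points = reach_all_points, ellipse = reach_ellipse)
```

The covariance is not immune to outliers, so the ellipse still widens slightly, but a handful of stray cells changes it far less than it changes an outline that must contain all of them.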
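```r
points <- data.frame(pbmc3k.final@reductions$umap@cell.embeddings,
                     cluster = Idents(pbmc3k.final))

## adapted from https://github.com/fawda123/ggord/blob/master/R/ggord.R
## Unit circle, traced forward and then back
theta <- c(seq(-pi, pi, length.out = 50), seq(pi, -pi, length.out = 50))
circle <- cbind(cos(theta), sin(theta))

## plyr is called as plyr::ddply below instead of being attached,
## so it does not mask dplyr functions
## Return points on the `prob` confidence ellipse for columns `one` and `two`
aux <- function(x, one, two, prob = 0.8) {
  if (nrow(x) <= 2) {
    return(NULL)
  }
  sigma <- var(cbind(x[, one], x[, two]))   # 2 x 2 covariance matrix
  mu <- c(mean(x[, one]), mean(x[, two]))   # cluster centre
  ed <- sqrt(qchisq(prob, df = 2))          # radius for the requested coverage
  data.frame(sweep(circle %*% chol(sigma) * ed, 2, mu, FUN = "+"))
}

## One set of ellipse points per cluster
ell <- plyr::ddply(points, "cluster", aux, one = "UMAP_1", two = "UMAP_2")

DimPlot(pbmc3k.final) +
  geom_mark_ellipse(data = ell,
                    aes(x = X1, y = X2, label = cluster, col = cluster),
                    inherit.aes = FALSE) +
  NoLegend()
```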
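The radius term `sqrt(qchisq(prob, df = 2))` in `aux` comes from the fact that, for bivariate normal data, the squared Mahalanobis distance follows a chi-square distribution with 2 degrees of freedom. The sketch below, on simulated Gaussian data, tabulates that radius for a few values of `prob` and checks the share of points that actually fall inside. The name `ellipse_radius` and the chosen probabilities are illustrative assumptions. UMAP clusters are generally not Gaussian, so on real embeddings `prob` is better read as a size knob than as an exact coverage.

```r
# Scale factor applied to the unit circle for a given coverage probability.
ellipse_radius <- function(prob) sqrt(qchisq(prob, df = 2))

probs <- c(0.1, 0.5, 0.8, 0.95)
radii <- ellipse_radius(probs)

# With 2 degrees of freedom the chi-square quantile has a closed form.
closed_form <- sqrt(-2 * log(1 - probs))

# Empirical check: fraction of simulated points inside each ellipse.
set.seed(42)
xy <- cbind(rnorm(5000), rnorm(5000))
d2 <- mahalanobis(xy, center = colMeans(xy), cov = var(xy))
coverage <- sapply(probs, function(p) mean(d2 <= qchisq(p, df = 2)))

round(data.frame(prob = probs, radius = radii, coverage = coverage), 3)
```

Because the radius grows with `prob`, lowering `prob` shrinks every cluster's ellipse toward its centre without changing its orientation.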
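Fine-tuning version 2

Next come a few tweaks: shrink the ellipse so that each label points to a better spot on its cluster.

```r
## Adjust the prob parameter
ell <- plyr::ddply(points, "cluster", aux, one = "UMAP_1", two = "UMAP_2", prob = 0.1)

DimPlot(pbmc3k.final) +
  geom_mark_ellipse(data = ell,
                    aes(x = X1, y = X2, label = cluster, col = cluster),
                    inherit.aes = FALSE) +
  NoLegend()
```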
![](http://image109.360doc.com/DownloadImg/2021/08/2619/229198221_6_20210826070039291_wm)

Hiding the ellipse

```r
## color = NA suppresses the ellipse outline; the labels are still drawn
DimPlot(pbmc3k.final) +
  geom_mark_ellipse(data = ell,
                    aes(x = X1, y = X2, label = cluster, group = cluster),
                    color = NA, inherit.aes = FALSE) +
  NoLegend()
```