bulk RNA-Seq (7): Sample Correlation, Clustering, and PCA


Author: Bioinfor生信云 | Published: 2022-07-05 17:29


Read the three tables

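library(tidyverse)  # also attaches readr, which provides read_delim()
# Gene annotation table (eggNOG-mapper output): keep only the columns used later
# Note: read_delim() has no row.names argument, so it is not passed here
gene_info <- read_delim("MM_6js01hvu.emapper.annotations.tsv",
                        delim = "\t", escape_double = FALSE,
                        col_names = FALSE, comment = "#", trim_ws = TRUE) %>%
  select(ID = X1,
         GO = X10,
         Ko = X12,
         pathway = X13,
         Gene_name = X9)
# Expression matrix: genes in rows, samples in columns
gene_exp <- read.table('genes.TMM.EXPR.matrix', header = TRUE, row.names = 1)

# Sample information: one row per sample, row names are the sample IDs
sample_info <- read.table(file = 'sample.txt', sep = "\t", header = TRUE, row.names = 1)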
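Before going further, it can help to confirm that the samples in the expression matrix and in the sample table line up. The sketch below is only an illustration: toy_exp and toy_info are hypothetical stand-ins for gene_exp and sample_info, and the values are made up.

```r
# Toy stand-ins for gene_exp and sample_info (hypothetical values)
toy_exp <- data.frame(
  S1 = c(5.1, 0.0, 12.3),
  S2 = c(4.8, 0.2, 11.9),
  S3 = c(9.7, 3.1, 2.4),
  row.names = c("geneA", "geneB", "geneC")
)
toy_info <- data.frame(
  group = c("treat", "ctrl", "ctrl"),
  row.names = c("S3", "S1", "S2")
)

# Every sample must appear in both tables
stopifnot(setequal(colnames(toy_exp), rownames(toy_info)))

# Reorder the metadata rows so they follow the matrix columns
toy_info <- toy_info[colnames(toy_exp), , drop = FALSE]

# TRUE once the two tables are aligned
identical(colnames(toy_exp), rownames(toy_info))
```

The same two checks could be applied to gene_exp and sample_info, assuming the sample IDs are the column names of the matrix and the row names of the sample table.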

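Sample correlation

Correlation analysis
R's cor() function computes correlation coefficients between variables. Given a matrix or data frame, it correlates the columns pairwise (Pearson by default), so applying it to the expression matrix yields one coefficient for every pair of samples.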


# Compute the sample-to-sample correlation matrix
sample_cor <- cor(gene_exp)
sample_cor1 <- round(sample_cor, digits = 2)
# Draw the heatmap
library(pheatmap)
pheatmap(sample_cor1, display_numbers = TRUE, fontsize = 10, angle_col = 45)
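To see what cor() returns for an expression-like matrix, here is a minimal sketch on simulated data. The matrix toy_exp is hypothetical, and the Spearman and log2 variants are shown only as commonly used alternatives, not as what the original analysis did.

```r
set.seed(1)
# Hypothetical toy matrix: 50 genes (rows) x 4 samples (columns)
toy_exp <- matrix(rexp(200, rate = 0.1), nrow = 50,
                  dimnames = list(paste0("gene", 1:50), paste0("S", 1:4)))

# Pearson correlation between columns (the default method)
pearson <- cor(toy_exp)
dim(pearson)  # 4 x 4: one row and one column per sample

# Rank-based alternative, less sensitive to a few highly expressed genes
spearman <- cor(toy_exp, method = "spearman")

# Log-transforming first is another option (the pseudocount of 1 is an assumption)
pearson_log <- cor(log2(toy_exp + 1))
```

Whichever variant is used, the result is a symmetric matrix with 1 on the diagonal, which is why it can be passed directly to pheatmap().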

Clustering dendrogram

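# dist() measures distances between rows, so transpose to put samples in rows
sample_dist <- dist(t(gene_exp))  # Euclidean distance by default
sample_hc <- hclust(sample_dist)  # complete linkage by default
plot(sample_hc)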
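The dendrogram shows the grouping visually; cutree() can turn it into explicit cluster labels. The following is a small sketch on simulated data, where the matrix toy and the choice of two clusters are assumptions made for illustration.

```r
set.seed(1)
# Hypothetical toy matrix: 100 genes x 4 samples; S3 and S4 are shifted upward
toy <- matrix(rnorm(400), nrow = 100,
              dimnames = list(NULL, paste0("S", 1:4)))
toy[, 3:4] <- toy[, 3:4] + 5

# Same steps as above: transpose, compute distances, cluster
toy_dist <- dist(t(toy))
toy_hc <- hclust(toy_dist)

# Cut the tree into two clusters and report the cluster of each sample
clusters <- cutree(toy_hc, k = 2)
clusters
```

In this toy case S1 and S2 fall into one cluster and S3 and S4 into the other, matching how the data were simulated.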

PCA


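library(PCAtools)
# colnames(gene_exp) must match rownames(sample_info), in the same order
# removeVar = 0.1 drops the 10% of genes with the lowest variance
p <- pca(gene_exp, metadata = sample_info, removeVar = 0.1)
pca_loadings <- p$loadings  # loading (contribution) of each gene on PC1, PC2, PC3, PC4, ...
pca_rotated <- p$rotated    # coordinates of each sample on each principal component
screeplot(p)  # proportion of the variance explained by each principal component
# colby and shape must name columns of sample_info
biplot(p,
       x = 'PC1',
       y = 'PC2',
       colby = 'group',
       shape = 'shape',
       legendPosition = 'right')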
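To make the roles of the loadings, the rotated coordinates, and the scree plot more concrete, here is a rough equivalent using base R's prcomp() on simulated data. This is a sketch, not how PCAtools is implemented internally; the matrix toy is hypothetical, and centering without scaling is chosen to mirror the PCAtools defaults.

```r
set.seed(1)
# Hypothetical toy matrix: 200 genes (rows) x 6 samples (columns)
toy <- matrix(rnorm(1200), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("S", 1:6)))
toy[1:50, 4:6] <- toy[1:50, 4:6] + 3  # make S4 to S6 differ in 50 genes

# prcomp() expects observations in rows, so samples go in rows
pc <- prcomp(t(toy), center = TRUE, scale. = FALSE)

scores <- pc$x           # samples x PCs, the analogue of p$rotated
loadings <- pc$rotation  # genes x PCs, the analogue of p$loadings

# Percentage of variance explained by each PC, the quantity a scree plot displays
var_explained <- 100 * pc$sdev^2 / sum(pc$sdev^2)
round(var_explained, 1)
```

Note that the naming is reversed between the two tools: prcomp() calls the gene loadings "rotation", whereas PCAtools calls the sample coordinates "rotated".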

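The data can be saved to an .RData file with save() and imported directly with load() next time, so the tables do not need to be read in again.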
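A minimal sketch of that round trip follows. The objects toy_exp and toy_cor and the temporary file path are hypothetical; in the real analysis one would pass objects such as gene_exp and sample_info and a file name of one's own choosing.

```r
# Toy objects standing in for the tables and results built above (hypothetical values)
toy_exp <- data.frame(S1 = c(1, 2), S2 = c(3, 4), row.names = c("geneA", "geneB"))
toy_cor <- cor(toy_exp)

# Write both objects into one file; tempfile() keeps the demo out of the working directory
rdata_path <- tempfile(fileext = ".RData")
save(toy_exp, toy_cor, file = rdata_path)

# Simulate a fresh session, then restore the objects under their original names
rm(toy_exp, toy_cor)
restored <- load(rdata_path)
restored  # names of the objects that were loaded
```

load() restores objects under the names they had when saved, so it can silently overwrite objects of the same name in the current session.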
