Multithreaded Computing in R

Author: 落寞的橙子 | Published: 2019-09-25 11:26

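If you have a good computer, multithreading in R opens up a whole new world, and if you have a supercomputer, congratulations, the sky is the limit.
Multithreading in R builds on the same idea as vectorization: put simply, avoid for loops and rework your function so that it can be called through the apply family. (Strictly speaking, the parallel package used below starts several R worker processes rather than threads.)
A for loop feeds the data to your code one item at a time, whereas an apply function skips the explicit loop: you hand R the whole set of inputs the loop would have walked through, and R calls your function on each of them. Once that clicks, the recipe is to wrap the work in a function first and then call it through a parallel version of apply.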
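The loop-to-apply rewrite described above can be sketched as follows. This is only an illustration: `square_plus_one` and `inputs` are made-up names that do not come from the original script, and the task is deliberately trivial.

```r
# Hypothetical per-item task, used only for illustration.
square_plus_one <- function(x) {
  x^2 + 1
}

inputs <- 1:5

# Loop version: iterate by index and fill a preallocated result vector.
loop_result <- numeric(length(inputs))
for (i in seq_along(inputs)) {
  loop_result[i] <- square_plus_one(inputs[i])
}

# apply version: hand the function and all inputs to lapply,
# then flatten the returned list into a vector.
apply_result <- unlist(lapply(inputs, square_plus_one))

# Both versions compute the same values.
print(identical(loop_result, apply_result))
```

Because each call to the function depends only on its own input, the `lapply` form can later be swapped for a parallel variant without touching the function itself.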
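There are several ways to do multithreading, and I will just list a few easy-to-find tutorials. The core is always the same: wrap the work in a function, avoid for loops, call a parallel apply, and then merge the results for each input. I may not have explained this very clearly, but keep the idea in mind; it may help once you have read more examples.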
Tutorial 1
Tutorial 2
Tutorial 3
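The three steps named above (wrap the work in a function, call a parallel apply, merge the results) might look like this in their smallest form. It is a sketch under assumptions: `summarise_input` and its inputs are hypothetical, and two workers is an arbitrary choice so that it runs on most machines.

```r
library(parallel)

# Step 1: wrap the per-input work in a function.
# Hypothetical task: summarise the integers 1..n as a one-row data frame.
summarise_input <- function(n) {
  values <- seq_len(n)
  data.frame(n = n, total = sum(values), mean = mean(values))
}

inputs <- c(10, 100, 1000)

# Step 2: start a small cluster and run the function on every input.
cl <- makeCluster(2)
results <- tryCatch(
  parLapply(cl, inputs, summarise_input),
  finally = stopCluster(cl)  # release the workers even if a call fails
)

# Step 3: merge the per-input results into one data frame.
combined <- do.call(rbind, results)
print(combined)
```

Wrapping the call in `tryCatch(..., finally = stopCluster(cl))` is a design choice rather than a requirement; it simply avoids leaving idle worker processes behind when one of the calls throws an error.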
An example of my own:

#!/usr/bin/env Rscript
# This script identifies candidate targets for a gene of interest.
# Set focus_gene to the name of that gene before running.
#
library(parallel)
input_dir <- "~/met_ccle/"
out_dir <- "~/identify_targets/"
met_cor <- read.table(paste0(input_dir, "all_list.tsv"), header = FALSE)
# Keep the part of each entry before the first underscore as the target name
targets <- unlist(lapply(as.character(met_cor[, 1]), FUN = function(x) {return(strsplit(x, split = "_", fixed = TRUE)[[1]][1])}))
focus_gene <- "geneName"

identify_targets<-function(target){
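  # Defined again inside the function: the cluster workers are separate R
  # sessions and do not see the focus_gene set in the main session
  focus_gene <- "geneName"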
  Cor_dir<-"~/Cor2/"
  rt_target<-read.csv(paste0(Cor_dir,focus_gene,"_cor.csv"),header = T)
  rt<-read.csv(paste0(Cor_dir,target,"_cor.csv"),header = T)
  rt_target_pos<-rt_target[rt_target$pvalue<0.05 & rt_target$cor>0,]
  rt_target_neg<-rt_target[rt_target$pvalue<0.05 & rt_target$cor<0,]  # negative correlations: cor < 0
  rt_target_pos_list<-as.character(rt_target_pos$j)
  rt_target_neg_list<-as.character(rt_target_neg$j)
  rt_pos<-rt[rt$pvalue<0.05 & rt$cor>0,]
  rt_neg<-rt[rt$pvalue<0.05 & rt$cor<0,]  # negative correlations: cor < 0
  rt_pos_list<-as.character(rt_pos$j)
  rt_neg_list<-as.character(rt_neg$j)
  pos_overlap<-length(intersect(rt_target_pos_list,rt_pos_list))
  neg_overlap<-length(intersect(rt_target_neg_list,rt_neg_list))
  total_overlap<-pos_overlap+neg_overlap
  return(c(focus_gene,target,pos_overlap,neg_overlap,total_overlap))
}
cl <- makeCluster(19)
results <- parLapply(cl,targets,identify_targets)
res.df <- do.call('rbind',results) 
stopCluster(cl)
colnames(res.df)<-c("focus_gene","target","pos_overlap","neg_overlap","total_overlap")
write.csv(res.df,paste0(out_dir,focus_gene,"_identified_targets.csv"))
[Figure: the run with 19 cores in use]
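The script above hard-codes 19 workers and sets `focus_gene` a second time inside the function. As a hedged alternative, the worker count can be derived with `detectCores()` and the variable can be copied to the workers with `clusterExport()`. The sketch below is not the original script: `label_target` and the target names `geneA` and `geneB` are hypothetical, and the worker count is capped at 2 only to keep the demonstration light.

```r
library(parallel)

focus_gene <- "geneName"  # placeholder name, as in the script above

# Hypothetical worker function that relies on a variable defined outside it.
label_target <- function(target) {
  paste(focus_gene, target, sep = "_vs_")
}

# Leave one core free when possible; detectCores() can return NA,
# in which case fall back to a single worker.
n_cores <- detectCores()
n_workers <- if (is.na(n_cores)) 1 else max(1, min(2, n_cores - 1))

cl <- makeCluster(n_workers)
# Copy focus_gene from the current environment to every worker.
clusterExport(cl, "focus_gene", envir = environment())
labels <- tryCatch(
  unlist(parLapply(cl, c("geneA", "geneB"), label_target)),
  finally = stopCluster(cl)
)
print(labels)
```

With the variable exported this way, the gene name only has to be set in one place, which removes the risk of the two copies drifting apart.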


Article link: https://www.haomeiwen.com/subject/xblaectx.html