2020-05-25 circRNA pipeline

Author: my_derek | Published 2020-05-25 00:22

Obtain the data

prefetch --option-file SRR_Acc_List.txt -O 01sra
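Every later step loops over the same eight runs, SRR5742687 to SRR5742694, so it can help to confirm that all downloads are present before continuing. A minimal sketch: the function names are hypothetical, and the flat `<dir>/<run>.sra` layout is an assumption taken from the paths used in the fasterq-dump step (some sra-tools versions place each run in its own subfolder instead).

```shell
# List the run accessions used throughout this pipeline (SRR5742687..SRR5742694).
list_runs() {
    for i in $(seq 87 94); do
        echo "SRR57426$i"
    done
}

# Print every run whose .sra file is absent from the given directory.
# Assumes the flat layout <dir>/<run>.sra (an assumption, see above).
missing_runs() {
    dir=$1
    for run in $(list_runs); do
        [ -f "$dir/$run.sra" ] || echo "$run"
    done
}
```

Calling `missing_runs 01sra` prints nothing once every run has been downloaded.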

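Convert to FASTQ. Pointing the temp folder (-t) at an SSD speeds this up; -e sets the number of threads. Run from inside 02fastq, as the ../01sra path implies.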

for i in {87..94};do fasterq-dump --split-3 ../01sra/SRR57426$i.sra -e 24 -p  -t /home/derek/Desktop/tmp; done
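A quick sanity check after the conversion is that both mates of each pair contain the same number of reads. A minimal sketch, assuming uncompressed FASTQ with exactly four lines per record; the function names are hypothetical.

```shell
# Count reads in an uncompressed FASTQ file (assumes 4 lines per record).
count_reads() {
    lines=$(wc -l < "$1" | tr -d ' ')
    echo $((lines / 4))
}

# Succeed only if both mates of a pair contain the same number of reads.
check_pair() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}
```

For example, `check_pair SRR5742687.sra_1.fastq SRR5742687.sra_2.fastq || echo "pair mismatch"`.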

Quality control with fastp

for i in {87..94};do fastp -i 02fastq/SRR57426${i}.sra_1.fastq -o 03fastp/SRR57426${i}_1.fastq.gz -I 02fastq/SRR57426${i}.sra_2.fastq -O 03fastp/SRR57426${i}_2.fastq.gz -w 8 --html 03fastp/${i}.html --json 03fastp/${i}.json;done

WARNING: fastp uses up to 16 threads although you specified 24 # I had requested 24 threads, but fastp supports at most 16
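To avoid that warning, the requested thread count can be clamped before it is passed to the tool. A minimal sketch; `cap_threads` is a hypothetical helper, and the limit of 16 comes from the warning message above.

```shell
# Clamp a requested thread count to a tool's maximum.
cap_threads() {
    requested=$1
    max=$2
    if [ "$requested" -gt "$max" ]; then
        echo "$max"
    else
        echo "$requested"
    fi
}
```

It would then be used as `fastp -w "$(cap_threads 24 16)" ...`.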

From the project root: align and convert to BAM

for i in {87..94};do bowtie2 -p 8 --very-sensitive --score-min=C,-15,0 --mm -x ./hsa/hg19/hg19 -q -1 02fastq/SRR57426$i.sra_1.fastq -2 02fastq/SRR57426$i.sra_2.fastq 2> 04out/bowtie2_$i.log | samtools view -hbuS - | samtools sort -o 04out/out$i.bam -;done
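bowtie2 writes its alignment summary to stderr, which the command above captures in 04out/bowtie2_$i.log. The summary ends with a line of the form "<percent> overall alignment rate". A minimal sketch for pulling that figure out of each log; `overall_rate` is a hypothetical helper name.

```shell
# Print the overall alignment rate (first field of the summary line) from a bowtie2 log.
overall_rate() {
    awk '/overall alignment rate/ { print $1 }' "$1"
}
```

For example, `for i in $(seq 87 94); do echo "$i $(overall_rate 04out/bowtie2_$i.log)"; done` lists one rate per sample.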

Extract the unmapped reads

for i in {87..94};do samtools view -hf 4 04out/out$i.bam | samtools view -Sb - > 05unmap/unmapped$i.bam;done
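The `-f 4` option keeps only records whose SAM FLAG has bit 0x4 (read unmapped) set. A minimal sketch of that bit test, for checking individual FLAG values by hand; `is_unmapped` is a hypothetical helper.

```shell
# Succeed if the SAM FLAG value has bit 0x4 (read unmapped) set.
is_unmapped() {
    [ $(( $1 & 4 )) -ne 0 ]
}
```

For example, FLAG 77 (1 + 4 + 8 + 64: paired, unmapped, mate unmapped, first in pair) passes the test, while FLAG 99 (1 + 2 + 32 + 64) does not.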

Generate the anchors

for i in {87..94};do python unmapped2anchors.py 05unmap/unmapped$i.bam | gzip > 06anchor/anchors$i.qfa.gz;done
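Conceptually, an anchor is a short stretch taken from each end of an unmapped read. The toy function below only illustrates that idea on a bare sequence string. It is not the logic of unmapped2anchors.py, and both the default length of 20 and the minimum-length check are assumptions made for illustration.

```shell
# Print the first and last N bases of a sequence, one per line.
# N defaults to 20, an assumed anchor length used for illustration only.
# Returns 1 if the sequence is too short for two non-overlapping anchors.
split_anchors() {
    seq_str=$1
    n=${2:-20}
    len=${#seq_str}
    [ "$len" -ge $((2 * n)) ] || return 1
    printf '%s\n' "$seq_str" | cut -c "1-$n"
    printf '%s\n' "$seq_str" | cut -c "$((len - n + 1))-$len"
}
```

For example, `split_anchors ACGTACGTAA 3` prints `ACG` and then `TAA`.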

find_circ: align the anchors

for i in {87..94};do bowtie2 -p 22 \
--reorder  \
--mm \
--score-min=C,-15,0 \
-q -x hsa/hg19/hg19 \
-U 06anchor/anchors$i.qfa.gz  \
-S 07align/align$i.sam ; done

Run find_circ.py on the alignments, split into two loops over the input files; I estimate about 30 hours.

for i in {87..90};do cat 07align/align$i.sam | python find_circ.py -G hsa/hg19.fa -p hsa_ --stats=08find/stats$i.txt --reads=08find/spliced_reads$i.fa > 08find/splice_sites$i.bed;done
for i in {91..94};do cat 07align/align$i.sam | python find_circ.py -G hsa/hg19.fa -p hsa_ --stats=08find/stats$i.txt --reads=08find/spliced_reads$i.fa > 08find/splice_sites$i.bed;done
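The two loops are independent of each other, so they can run side by side from one shell instead of two terminals. A minimal sketch using background jobs and `wait`; `process_sample` is a hypothetical placeholder whose body would be replaced by the per-sample find_circ.py command.

```shell
# Hypothetical placeholder: replace the body with the per-sample find_circ.py command.
process_sample() {
    echo "sample $1 done"
}

# Run the samples given as arguments one after another.
run_batch() {
    for i in "$@"; do
        process_sample "$i"
    done
}

# Start both halves in the background, then block until both have finished.
run_two_batches() {
    run_batch 87 88 89 90 &
    run_batch 91 92 93 94 &
    wait
}
```

Calling `run_two_batches` returns only after all eight samples have been processed.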

Extract the circRNA candidates

for i in {87..94};do grep CIRCULAR 08find/splice_sites$i.bed | grep -v chrM | awk '$5>=2' | grep UNAMBIGUOUS_BP | grep ANCHOR_UNIQUE | python maxlength.py 100000 > 08filter/circ_candidates$i.bed; done
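The grep and awk stages of that chain can also be written as a single awk pass, which makes the filter criteria easier to read in one place. A minimal sketch; `filter_circ` is a hypothetical helper, and the length filter is left to maxlength.py as before.

```shell
# Keep lines that pass the same text and read-count filters as the grep/awk chain:
# CIRCULAR, UNAMBIGUOUS_BP and ANCHOR_UNIQUE present, chrM absent, column 5 >= 2.
filter_circ() {
    awk '/CIRCULAR/ && /UNAMBIGUOUS_BP/ && /ANCHOR_UNIQUE/ && !/chrM/ && $5 >= 2' "$1"
}
```

Its output can be piped into `python maxlength.py 100000` exactly like the output of the original chain.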


Article link: https://www.haomeiwen.com/subject/ekfeahtx.html