A preprocessing problem when comparing gene sets of closely related species 2020-04-28

Author: SnorkelingFan凡潜 | Published 2020-04-28 19:39
  • Script: step1_com_gff.sh
perl */complete_gene.pl --start 50 --stop 50 */all_maker.f1.filter1.gff  */polishedGenome.fa  > Sp.compare.gff
  • Submit the script
    qsub step1_com_gff.sh
  • Error output
Argument "." isn't numeric in multiplication (*) at /pl_script/complete_gene.pl line 110.
Argument "." isn't numeric in multiplication (*) at /pl_script/complete_gene.pl line 110.
Argument "." isn't numeric in multiplication (*) at /pl_script/complete_gene.pl line 110.
Argument "." isn't numeric in multiplication (*) at /pl_script/complete_gene.pl line 110.
Argument "." isn't numeric in multiplication (*) at /pl_script/complete_gene.pl line 110.
  • Inspect the script complete_gene.pl
    less -SN complete_gene.pl
110                         my $score = $gene{$chr}{$id}{score} + ($gene{$chr}{$id}{start} + $gene{$chr}{$id}{end}) * 3 / $gene{$chr}{$id}{cds_length} * $gene{$chr}{$id}{score} / 100;
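  1. Look at the error message again
    Argument "." isn't numeric in multiplication (*) at /pl_script/complete_gene.pl line 110.
    Roughly: at line 110 of complete_gene.pl, a value read from the input file is a literal dot (".") where the multiplication expects a number.
  2. Look at line 110 again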
    my $score = $gene{$chr}{$id}{score} + ($gene{$chr}{$id}{start} + $gene{$chr}{$id}{end}) * 3 / $gene{$chr}{$id}{cds_length} * $gene{$chr}{$id}{score} / 100
  3. Inspect the input GFF file
ctg278_np512    maker   mRNA    175030  178587  .       +       .       ID=maker-ctg278_np512-augustus-gene-0.14-mRNA-1;
ctg278_np512    maker   CDS     175030  175148  .       +       0       Parent=maker-ctg278_np512-augustus-gene-0.14-mRNA-1;
ctg278_np512    maker   CDS     175317  178587  .       +       1       Parent=maker-ctg278_np512-augustus-gene-0.14-mRNA-1;
ctg278_np512    maker   mRNA    255419  260530  .       +       .       ID=maker-ctg278_np512-augustus-gene-0.15-mRNA-1;
ctg278_np512    maker   CDS     255419  255662  .       +       0       Parent=maker-ctg278_np512-augustus-gene-0.15-mRNA-1;
ctg278_np512    maker   CDS     260163  260530  .       +       2       Parent=maker-ctg278_np512-augustus-gene-0.15-mRNA-1;

Column 6 is a dot on every line of this file.

  4. Inspect the structure of a GFF file that runs successfully
scaffold23_size4198784_pilon    GLEAN   mRNA    3579208 3600109 0.910585        +       .       ID=_GLEAN_10012814;
scaffold23_size4198784_pilon    GLEAN   CDS     3579208 3581773 .       +       0       Parent=_GLEAN_10012814;
scaffold23_size4198784_pilon    GLEAN   CDS     3582784 3583286 .       +       2       Parent=_GLEAN_10012814;
scaffold23_size4198784_pilon    GLEAN   CDS     3596757 3596988 .       +       0       Parent=_GLEAN_10012814;
scaffold23_size4198784_pilon    GLEAN   CDS     3600051 3600109 .       +       2       Parent=_GLEAN_10012814;
scaffold125_size1384276_pilon   GLEAN   mRNA    767573  770133  0.999998        -       .       ID=_GLEAN_10004397;
scaffold125_size1384276_pilon   GLEAN   CDS     770002  770133  .       -       0       Parent=_GLEAN_10004397;
scaffold125_size1384276_pilon   GLEAN   CDS     769820  769906  .       -       0       Parent=_GLEAN_10004397;
scaffold125_size1384276_pilon   GLEAN   CDS     768019  768159  .       -       0       Parent=_GLEAN_10004397;
scaffold125_size1384276_pilon   GLEAN   CDS     767766  767864  .       -       0       Parent=_GLEAN_10004397;
scaffold125_size1384276_pilon   GLEAN   CDS     767573  767680  .       -       0       Parent=_GLEAN_10004397;

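The GFF that runs successfully was produced by GLEAN, and its mRNA lines carry a numeric value in column 6. The GFF that triggers the message was produced by maker, and its mRNA lines have only a dot in column 6.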
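
The comparison above can be automated before submitting the job. The following is a minimal Python sketch, not part of the original pipeline: the function names `is_number` and `count_missing_scores` are hypothetical, and it assumes a tab-separated GFF with nine columns. It counts the mRNA lines and how many of them lack a numeric column 6.

```python
import io


def is_number(text):
    """Return True if text parses as a float (so "." gives False)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def count_missing_scores(handle, feature="mRNA"):
    """Return (lines of this feature type, lines whose column 6 is not numeric)."""
    total = 0
    missing = 0
    for line in handle:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        total += 1
        if not is_number(cols[5]):
            missing += 1
    return total, missing


# One mRNA line from each of the two files shown above, rebuilt with tabs.
maker_line = "\t".join([
    "ctg278_np512", "maker", "mRNA", "175030", "178587", ".", "+", ".",
    "ID=maker-ctg278_np512-augustus-gene-0.14-mRNA-1;",
])
glean_line = "\t".join([
    "scaffold23_size4198784_pilon", "GLEAN", "mRNA", "3579208", "3600109",
    "0.910585", "+", ".", "ID=_GLEAN_10012814;",
])
sample = io.StringIO(maker_line + "\n" + glean_line + "\n")
print(count_missing_scores(sample))  # (2, 1)
```

On a real file, pass an opened file handle instead of the `io.StringIO` sample. A non-zero second number means complete_gene.pl will print the same message for that input.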

  5. Check how complete_gene.pl defines $gene{$chr}{$id}{score}
75                 $gene{$c[0]}{$id}{score} = $c[5];
76         }elsif($c[2] eq 'CDS'){

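Column 6 of the input GFF is the score, and it enters the calculation on line 110 of complete_gene.pl. The assignment on line 75 sits in the branch before elsif($c[2] eq 'CDS'), so it applies to lines whose column 3 is not CDS, such as mRNA lines. On those lines column 6 must be numeric. Here it is a dot, which produces the message.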
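
To make the dependence on column 6 concrete, here is a small Python sketch that mirrors the arithmetic of line 110. It is an illustration only: `combined_score` and `parse_score` are hypothetical names, and the input values in the usage lines are made up. One difference from Perl is worth noting: Perl under warnings reports the message and evaluates the dot as 0, whereas this sketch refuses the dot outright.

```python
def combined_score(score, start, end, cds_length):
    """Same arithmetic as line 110 of complete_gene.pl, operator for operator."""
    return score + (start + end) * 3 / cds_length * score / 100


def parse_score(column6):
    """Convert GFF column 6 to a float; reject a non-numeric placeholder such as "."."""
    try:
        return float(column6)
    except ValueError:
        raise ValueError(f"column 6 is not numeric: {column6!r}")


# Hypothetical inputs: a GLEAN-style score with made-up start/end/cds_length values.
print(combined_score(parse_score("0.910585"), 1, 1, 300))

# A maker-style mRNA line has "." in column 6, so parsing fails here.
try:
    parse_score(".")
except ValueError as err:
    print(err)
```

Because the score appears twice in the formula, a dot that Perl evaluates as 0 makes the whole result 0, so the messages should not be ignored even though the job keeps running.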


Suggested fixes

  1. Rewrite complete_gene.pl so that the column 6 score no longer takes part in the calculation
  2. Or change the definition of score in complete_gene.pl so that it uses maker's AED value

Note: only the original GFF file produced by maker, all_maker.gff, carries AED values.

##gff-version 3
ctg278_np512    .       contig  1       1283470 .       .       .       ID=ctg278_np512;Name=ctg278_np512
ctg278_np512    maker   gene    175030  178587  .       +       .       ID=maker-ctg278_np512-augustus-gene-0.14;Name=maker-ctg278_np512-augustus-gene-0.14
ctg278_np512    maker   mRNA    175030  178587  .       +       .       ID=maker-ctg278_np512-augustus-gene-0.14-mRNA-1;Parent=maker-ctg278_np512-augustus-gene-0.14;Name=maker-ctg278_np512-augustus-gene-0.14-mRNA-1;_AED=0.26;_eAED=0.19;_QI=0|0|0|0.5|1|1|2|0|1129
ctg278_np512    maker   exon    175030  175148  .       +       .       ID=maker-ctg278_np512-augustus-gene-0.14-mRNA-1:exon:0;Parent=maker-ctg278_np512-augustus-gene-0.14-mRNA-1
ctg278_np512    maker   exon    175317  178587  .       +       .       ID=maker-ctg278_np512-augustus-gene-0.14-mRNA-1:exon:1;Parent=maker-ctg278_np512-augustus-gene-0.14-mRNA-1
ctg278_np512    maker   CDS     175030  175148  .       +       0       ID=maker-ctg278_np512-augustus-gene-0.14-mRNA-1:cds;Parent=maker-ctg278_np512-augustus-gene-0.14-mRNA-1
ctg278_np512    maker   CDS     175317  178587  .       +       1       ID=maker-ctg278_np512-augustus-gene-0.14-mRNA-1:cds;Parent=maker-ctg278_np512-augustus-gene-0.14-mRNA-1
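
An alternative to editing complete_gene.pl is to fill column 6 of the mRNA lines from the `_AED` attribute before running it. The Python sketch below is one possible way to do that and is not the author's method; `fill_score_from_aed` and the `invert` flag are hypothetical. It assumes a tab-separated file and that writing 1 - AED is acceptable. That choice is an assumption: AED is 0 for the best-supported models and 1 for the worst, while the GLEAN values shown earlier are close to 1.

```python
import re

# Match "_AED=<number>" as a whole attribute key (this does not match "_eAED=").
AED_PATTERN = re.compile(r"(?:^|;)_AED=([0-9]*\.?[0-9]+)")


def fill_score_from_aed(line, invert=True):
    """Return the GFF line with column 6 of an mRNA record set from its _AED value.

    Lines that are not mRNA records, or that carry no _AED attribute,
    are returned unchanged (without the trailing newline).
    """
    stripped = line.rstrip("\n")
    cols = stripped.split("\t")
    if stripped.startswith("#") or len(cols) < 9 or cols[2] != "mRNA":
        return stripped
    match = AED_PATTERN.search(cols[8])
    if match is None:
        return stripped
    aed = float(match.group(1))
    # Assumption: store 1 - AED so that a larger value means a better model.
    value = 1.0 - aed if invert else aed
    cols[5] = f"{value:.6g}"
    return "\t".join(cols)


# The mRNA line from all_maker.gff shown above, rebuilt with tabs.
mrna_line = "\t".join([
    "ctg278_np512", "maker", "mRNA", "175030", "178587", ".", "+", ".",
    "ID=maker-ctg278_np512-augustus-gene-0.14-mRNA-1;"
    "Parent=maker-ctg278_np512-augustus-gene-0.14;"
    "Name=maker-ctg278_np512-augustus-gene-0.14-mRNA-1;"
    "_AED=0.26;_eAED=0.19;_QI=0|0|0|0.5|1|1|2|0|1129",
])
print(fill_score_from_aed(mrna_line).split("\t")[5])  # 0.74
```

Since the filtered file used in step1_com_gff.sh no longer has the `_AED` attribute, a step like this would need to run on all_maker.gff, before the filtering that strips the attributes.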

(End)