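To build a phylogenetic tree from protein domains, the domain sequences first have to be cut out of the full-length protein sequences. The following Perl script does this: it reads a FASTA file and a table of domain coordinates, then prints each domain sequence in FASTA format.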
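#!/usr/bin/perl
use strict;
use warnings;

# Usage: perl Domain_seq_extrac.pl <fasta_file> <domain_table>
my ($fasta_file, $domain_file) = @ARGV;
die "Usage: $0 <fasta_file> <domain_table>\n" unless defined $domain_file;

# Read the FASTA file one record at a time, using ">" as the record separator.
my %hash;
open my $fa, '<', $fasta_file or die "Cannot open $fasta_file: $!";
{
    local $/ = ">";
    <$fa>;    # discard the empty chunk before the first ">"
    while (<$fa>) {
        chomp;    # strip the trailing ">"
        my ($id, $seq) = split /\n/, $_, 2;
        next unless defined $seq;
        $seq =~ s/\n//g;      # join wrapped sequence lines
        $hash{$id} = $seq;    # the key is the full header line
    }
}
close $fa;

# Read the domain table: ID, start, end (tab-separated, 1-based, inclusive).
open my $in, '<', $domain_file or die "Cannot open $domain_file: $!";
while (<$in>) {
    chomp;
    my @temp = split /\t/, $_;
    my $length = $temp[2] - $temp[1] + 1;
    if (exists $hash{$temp[0]}) {
        # substr expects a 0-based offset, so subtract 1 from the start.
        my $sequence = substr($hash{$temp[0]}, $temp[1] - 1, $length);
        print ">$temp[0]\n$sequence\n";
    }
}
close $in;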
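The coordinate arithmetic is the part most likely to go wrong by one position, so here is a minimal sketch of it on its own. The sequence and the coordinates are made up for illustration; the only assumption carried over from the script is that start and end are 1-based and inclusive.

```perl
use strict;
use warnings;

# Hypothetical sequence and coordinates, for illustration only.
my $seq = "MKTAYIAKQR";
my ($start, $end) = (3, 6);    # 1-based, inclusive

# substr takes a 0-based offset and a length.
my $domain = substr($seq, $start - 1, $end - $start + 1);
print "$domain\n";    # prints "TAYI"
```

If the coordinates in your table are 0-based or end-exclusive instead, the offset and length need to be adjusted accordingly.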
Input file 1 (protein sequences in FASTA format):
image.png
Input file 2 (tab-separated table: sequence ID, domain start, domain end):
image.png
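Note that the script uses the whole FASTA header line as the hash key, so a domain is only extracted when the ID in the first column of the table matches that header exactly. If your headers carry a description after the ID, one possible workaround (a sketch, not part of the original script) is to keep only the first whitespace-delimited token. The header below is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical header line; real headers may look different.
my $header = "prot1 hypothetical protein [Some species]";

# Keep only the first whitespace-delimited token as the ID.
my ($id) = split /\s+/, $header;
print "$id\n";    # prints "prot1"
```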
Running the code:
perl .\Domain_seq_extrac.pl .\ALl_combined_1.txt .\Domain_for_perl.txt
Results:
image.png











