美文网首页
pyrosetta-13Point Mutation Scan

pyrosetta-13Point Mutation Scan

作者: 学了忘了学 | 来源:发表于2023-02-07 22:07 被阅读0次

Overview

本节的目的是使用您在前几节中学习的内容创建协议。在本教程中,我们将准备抗体抗原结合结构,在抗体上进行点突变,记录结合强度的变化,并生成显示能量差异的热图。该方法广泛用于抗体界面设计,以提高结合亲和力或扩大结合范围。

Protocol

整个Protocol分为八个步骤:

  1. 使用FastRelax()准备结构
  2. 编写函数以执行突变PackMover()
  3. 编写函数以解除绑定抗体抗原结合结构unbind()
  4. 编写获取野生型氨基酸的函数
  5. 编写函数以正确变异和pack特定的残基并输出能量情况
  6. 循环通过接口位置,用输出文件将它们变为20个氨基酸
  7. 汇总结合能分析的所有输入文件

Loading structures

from pyrosetta import * 
init()
pose = pose_from_pdb("inputs/1jhl.clean.pdb")
testPose = Pose()
testPose.assign(pose)
print(testPose)
PDB file name: inputs/1jhl.clean.pdb
Total residues: 353
Sequence: DIELTQSPSYLVASPGETITINCRASKSISKSLAWYQEKPGKTNNLLIYSGSTLQSGIPSRFSGSGSGTDFTLTISSLEPEDFAMYICQQHNEYPWTFGGGTKLEIKRQVQLQQSGAELVRPGASVKLSCKASGYTFISYWINWVKQRPGQGLEWIGNIYPSDSYTNYNQKFKDKATLTVDKSSSTAYMQLSSPTSEDSAVYYCTRDDNYGAMDYWGQGTTVTVKVYGRCELAAAMKRMGLDNYRGYSLGNWVCAAKFESNFNTGATNRNTDGSTDYGILQINSRWWCNDGRTPGSKNLCHIPCSALLSSDITASVNCAKKIVSDGDGMNAWVAWRKHCKGTDVNVWIRGCRL
Fold tree:
FOLD_TREE  EDGE 1 108 -1  EDGE 1 109 1  EDGE 109 224 -1  EDGE 1 225 2  EDGE 225 353 -1 

Step 1. Prepare the starting structure with FastRelax()

适当relax结构对于Rosetta的设计至关重要。非松弛结构可能无法很好地克服不良的全局能量,因此会影响以下关于结合能的分析。
FastRelax()用于relax结构。虽然我们希望将结构置于最低能量状态,但我们希望尽可能保留晶体结构中的骨架信息(最低RMSD)。因此,我们将constrain_relax_to_start_coords(True)应用于FastRelax()。
由于FastRelax()占用了大量资源,运行它似乎会使notebook崩溃,因此我们删除了“应用”部分(执行松弛的部分),并打印出松弛Mover()。我们将松弛的结构上传到输入文件夹,以便进一步分析。

坐标约束relax是一种常规准备。有关此准备方法的更高级版本,请参阅:https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0059004

from pyrosetta.rosetta.protocols.relax import FastRelax

relax = FastRelax()
scorefxn = get_fa_scorefxn()
relax.set_scorefxn(scorefxn)
relax = rosetta.protocols.relax.FastRelax()
relax.constrain_relax_to_start_coords(True)
print(relax)
#relax.apply(testPose)
#testPose.dump_pdb('test.relax.pdb')

Writing Function in Python

函数是组织代码的好方法。从本节开始,我将介绍一些功能来促进协议。

要在python中定义函数,需要使用“def”关键字。函数既可以返回值,也可以简单地执行代码块。定义的函数也可以在主函数或代码的其他部分中调用。

Step 2. Write the function for mutation

此函数利用了先前教程中演示的ResidueSelectorMover()。允许设计和repack突变位置,而附近的残留物仅限于repack。最后的突变将使用PackMover()执行。

from pyrosetta.rosetta.core.pack.task import *
from pyrosetta.rosetta.protocols import *
from pyrosetta.rosetta.core.select import *

def pack(pose, posi, amino, scorefxn):

    # Select Mutate Position
    mut_posi = pyrosetta.rosetta.core.select.residue_selector.ResidueIndexSelector()
    mut_posi.set_index(posi)
    #print(pyrosetta.rosetta.core.select.get_residues_from_subset(mut_posi.apply(pose)))

    # Select Neighbor Position
    nbr_selector = pyrosetta.rosetta.core.select.residue_selector.NeighborhoodResidueSelector()
    nbr_selector.set_focus_selector(mut_posi)
    nbr_selector.set_include_focus_in_subset(True)
    #print(pyrosetta.rosetta.core.select.get_residues_from_subset(nbr_selector.apply(pose)))

    # Select No Design Area
    not_design = pyrosetta.rosetta.core.select.residue_selector.NotResidueSelector(mut_posi)
    #print(pyrosetta.rosetta.core.select.get_residues_from_subset(not_design.apply(pose)))

    # The task factory accepts all the task operations
    tf = pyrosetta.rosetta.core.pack.task.TaskFactory()

    # These are pretty standard
    tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline())
    tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent())
    tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides())

    # Disable Packing
    prevent_repacking_rlt = pyrosetta.rosetta.core.pack.task.operation.PreventRepackingRLT()
    prevent_subset_repacking = pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(prevent_repacking_rlt, nbr_selector, True )
    tf.push_back(prevent_subset_repacking)

    # Disable design
    tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
        pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(),not_design))

    # Enable design
    aa_to_design = pyrosetta.rosetta.core.pack.task.operation.RestrictAbsentCanonicalAASRLT()
    aa_to_design.aas_to_keep(amino)
    tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(aa_to_design, mut_posi))
    
    # Create Packer
    packer = pyrosetta.rosetta.protocols.minimization_packing.PackRotamersMover()
    packer.task_factory(tf)

    #Perform The Move
    if not os.getenv("DEBUG"):
        packer.apply(pose)

#Load the previously-relaxed pose.
relaxPose = pose_from_pdb("inputs/1jhl.relax.pdb")

#Clone it
original = relaxPose.clone()
scorefxn = get_score_function()
print("\nOld Energy:", scorefxn(original),"\n")
pack(relaxPose, 130, 'A', scorefxn)
print("\nNew Energy:", scorefxn(relaxPose),"\n")

#Set relaxPose back to original
relaxPose = original.clone()

Old Energy: -1056.7705978308402
New Energy: -1057.2353638196532

Step 3. unbind()

该函数用于结合能分析。为了计算结合能,我们需要对结合态结构的总能量进行评分,分离(解除结合)抗原和抗体,然后对未结合态总能量进行打分。bound energy - unbound energy。您还可以使用位于rosetta.protocols.analysis中的InterfaceAnalyzerOver(IAM)。

from pyrosetta.rosetta.protocols import *

def unbind(pose, partners):
    STEP_SIZE = 100
    JUMP = 2
    docking.setup_foldtree(pose, partners, Vector1([-1,-1,-1]))
    trans_mover = rigid.RigidBodyTransMover(pose,JUMP)
    trans_mover.step_size(STEP_SIZE)
    trans_mover.apply(pose)


##Reset the original pose
relaxPose = original.clone()

scorefxn = get_score_function()
bound_score = scorefxn(relaxPose)
print("\nBound State Score",bound_score,"\n")
unbind(relaxPose, "HL_A")
unbound_score = scorefxn(relaxPose)

print("\nUnbound State Score", unbound_score,"\n")
print('dG', bound_score - unbound_score, 'Rosetta Energy Units (REU)')
Bound State Score -1056.7706161583792 

Unbound State Score -998.7222028352029 

dG -58.048413323176305 Rosetta Energy Units (REU)

Step 4. wildtype()

评估结合改善的一个重要指标是突变体结合能与野生型结合能的比率。此函数返回给定位置的野生型氨基酸。

def wildtype(aatype = 'AA.aa_gly'):
    AA = ['G','A','L','M','F','W','K','Q','E','S','P'
            ,'V','I','C','Y','H','R','N','D','T']

    AA_3 = ['AA.aa_gly','AA.aa_ala','AA.aa_leu','AA.aa_met','AA.aa_phe','AA.aa_trp'
            ,'AA.aa_lys','AA.aa_gln','AA.aa_glu', 'AA.aa_ser','AA.aa_pro','AA.aa_val'
            ,'AA.aa_ile','AA.aa_cys','AA.aa_tyr','AA.aa_his','AA.aa_arg','AA.aa_asn'
            ,'AA.aa_asp','AA.aa_thr']

    for i in range(0, len(AA_3)):
        if(aatype == AA_3[i]):
            return AA[i]

print(wildtype(str(relaxPose.aa(130))))

Setp 4a. Exploration.

我们也可以使用一些实用函数来代替这个函数。

我们可以使用返回的序列STRING。注意,在Rosetta中,字符串的索引为0。

print(relaxPose.sequence())
print("\n",relaxPose.sequence()[2])
DIELTQSPSYLVASPGETITINCRASKSISKSLAWYQEKPGKTNNLLIYSGSTLQSGIPSRFSGSGSGTDFTLTISSLEPEDFAMYICQQHNEYPWTFGGGTKLEIKRQVQLQQSGAELVRPGASVKLSCKASGYTFISYWINWVKQRPGQGLEWIGNIYPSDSYTNYNQKFKDKATLTVDKSSSTAYMQLSSPTSEDSAVYYCTRDDNYGAMDYWGQGTTVTVKVYGRCELAAAMKRMGLDNYRGYSLGNWVCAAKFESNFNTGATNRNTDGSTDYGILQINSRWWCNDGRTPGSKNLCHIPCSALLSSDITASVNCAKKIVSDGDGMNAWVAWRKHCKGTDVNVWIRGCRL

E

现在让我们从保存化学信息的ResidueType对象中获取一些信息。

print(relaxPose.residue_type(1))
ASP:NtermProteinFull (ASP, D):
Base: ASP
 Properties: POLYMER PROTEIN CANONICAL_AA LOWER_TERMINUS SC_ORBITALS POLAR CHARGED NEGATIVE_CHARGE METALBINDING ALPHA_AA L_AA
 Variant types: LOWER_TERMINUS_VARIANT
 Main-chain atoms:  N    CA   C  
 Backbone atoms:    N    CA   C    O   1H   2H   3H    HA 
 Side-chain atoms:  CB   CG   OD1  OD2 1HB  2HB 

最后,让我们从类型中获取残留物

resT = relaxPose.residue_type(1)

print(resT.base_name())
print(resT.name3())
print(resT.name1())
ASP
ASP
D

Step 5. Integrate functions for mutate and output

def mutate(pose, posi, amino, partners):
    #main function for mutation
    CSV_PREFIX = 'notec'
    PDB_PREFIX = 'notep'

    #Initiate test pose
    testPose = Pose()
    testPose.assign(pose)

    #Initiate energy function
    scorefxn = get_fa_scorefxn()
    unbind(testPose, partners)
    native_ub = scorefxn(testPose)
    testPose.assign(pose)
    
    #Variables initiation
    content = ''
    name = CSV_PREFIX + str(posi)+str(amino) + '.csv'
    pdbname = PDB_PREFIX + str(posi)+str(amino) + '.pdb'
    wt = wildtype(str(pose.aa(posi)))
    
    # energy before mutaion
    pack(testPose, posi, amino, scorefxn)
    testPose.dump_pdb(pdbname)
    bound = scorefxn(testPose)
    unbind(testPose, partners)
    unbound = scorefxn(testPose)
    binding = unbound - bound
    testPose.assign(pose)
    
    # energy after mutaion
    if (wt == amino):
        wt_energy = binding
    else:
        pack(testPose, posi, wt, scorefxn)
        wtbound = scorefxn(testPose)
        unbind(testPose, partners)
        wtunbound = scorefxn(testPose)
        wt_energy = wtunbound - wtbound
        testPose.assign(pose)

    content=(content+str(pose.pdb_info().pose2pdb(posi))+','+str(amino)+','
              +str(native_ub)+','+str(bound)+','+str(unbound)+','+str(binding)+','
              +str(wt_energy)+','+str(wt)+','+str(binding/wt_energy)+'\n')

    f = open(name,'w+')
    f.write(content)
    f.close()

mutate(relaxPose, 130, 'A', 'HL_A')

Step 6. Loop through interface positions

以下代码循环通过所有重链和轻链位置,将它们突变为所有20个氨基酸。实际运行可能再次使笔记本崩溃。

您可以任意输出数据。pose.scores()将具有数据、pandas Dataframes数据帧。在这里,我们将简单地以csv文件的形式输出每个突变的能量信息,并将它们分类到一个文件中以供将来分析。将输出文件上载到输入文件夹以进行R分析。

在实际工作环境中,可以实现并行化。请参见PyRosetta一章。对于此任务,您不一定需要完成整个循环。我们在inputs文件夹中有一个完成的版本。

#NOTE - This takes a very long time!!  
# You may skip this block to continue the tutorial with pre-configured outputs.
if not os.getenv("DEBUG"):
    for i in [92,85,68,53,5,45,44,42,32,31,22,108,100]:
        print("\nMutating Position: ",str(i),"\n")
        for j in ['G','A','L','M','F','W','K','Q','E','S','P','V','I','C','Y','H','R','N','D','T']:
            mutate(relaxPose, i, j, 'HL_A')

Analysis of binding data

在收集汇总的结合能量信息后,我们使用panda进行过滤和可视化。我们过滤掉了较低的非束缚态能量结构,即那些非束缚态总能量高于原生态的结构,并从过滤后的数据中生成热图。如果您不想在第8步完成for循环,我们将合并输出csv的完成版本上载到名为“note_output.csv”的输入文件夹。

#import modules need for analysis
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt 
import seaborn as sns
csv_file_name = 'inputs/note_output.csv'
#Generating heatmaps

UNBOUND_CUTOFF = -995
RATIO_CUTOFF = 1.001

#load the data into a pandas dataframe
df = pd.read_csv(csv_file_name, names='Position Amino.Acid Native Bound Unbound Binding WT_Binding WT Ratio'.split(), index_col=False )
#Add wildtype AA to position (for display)
df['Position_WT_aa'] = df.Position + ' (' + df.WT  + ')' 

#filter values
df = df.query(f'Unbound<{UNBOUND_CUTOFF} and Ratio>{RATIO_CUTOFF}')

# convert from tidy format (https://en.wikipedia.org/wiki/Tidy_data) to a wider format
df = pd.pivot_table(df, values='Ratio', 
                     index=['Position_WT_aa'], 
                     columns='Amino.Acid')

#plot the heatmap
f, ax = plt.subplots(figsize=(10, 10))
sns.heatmap(df, cmap='coolwarm', linewidths= 1, linecolor='lightgray')
plt.xlabel('Amino acid mutation');
plt.ylabel('Position chain and wild type AA');
sns.set_context("talk") #make labels bigger

参考

Jupyter Notebook Viewer (nbviewer.org)

相关文章

  • Vuex使用介绍(二)

    Mutation Vuex 通过commit一个mutation来对state进行更改。每个 mutation 都...

  • action和mutation区别

    action中处理异步,mutation不可以mutation做原子操作action可以整合多个mutation的...

  • Vuex

    getter mutation and action // actions // mutation是同步操作,如果...

  • TMB/TML

    肿瘤突变负荷(Tumor Mutation Burden,TMB // Tumor Mutation Load(...

  • java基础第七天

    1.jdk提供的方法: scan.nextInt() scan.nextDouble(); scan.next()...

  • vuex Mutation

    Mutation 更改Vuex的store中的状态的唯一方法是提交mutation。Vuex中的mutation非...

  • Oracle中SCAN命令用法

    1、查看SCAN VIP配置 srvctl config scan ---集群名及3个SCAN VIP的配置。...

  • RxJava2.x-scan语法

    一、scan语法 日志 二、scan语法2 日志 总结 1、scan(BiFunction ac...

  • TMB、Checkpoint

    肿瘤突变负荷,英文全称Tumor Mutation Burden(TMB)或Tumor Mutation Load...

  • MutationObserver 监听DOM树变化

    1 概述 Mutation observer 是用于代替 Mutation events 作为观察DOM树结构发生...

网友评论

      本文标题:pyrosetta-13Point Mutation Scan

      本文链接:https://www.haomeiwen.com/subject/igixkdtx.html