Team:CU-Boulder/Model

Modeling

To design a linker joining the two protein domains, we primarily used PyRosetta and PyMOL.

We started by assessing which ends of the AraC domain's protein chains should serve as the attachment point. Because AraC's homodimeric structure has two floppy chains at its N-terminus, and its C-terminus sits at the end of an alpha helix, we decided that the best attachment point would be the ends of the two floppy chains.

For the antibody, it was best to use the C-terminus (pink) as the attachment point, as the N-terminus was hidden inside the middle of the protein (orange).

Ideally, the shorter the linker, the more likely the AraC domain is to force the two chains of the antibody apart. We therefore used PyRosetta's docking functions, together with some manual adjustments in PyMOL, to close the gap between the two domains and to rotate the structures so that the distance between the domains' centers of mass was as small as possible. Shown below is the resulting structure:

From inspection, we estimated that a linker of 0 to 20 residues was needed to join the two domains, and that two linkers would be needed to attach AraC securely to the antibody. We started by modeling the linker on only one side of the two domains. Twenty-one models were created: the first had 0 linker residues, and each subsequent model added one residue until 20 residues were reached. The linker residues alternated between glycine and serine. Glycine was chosen for its flexibility, and serine because the linker needed to be soluble in aqueous solution for later protein purification. The linker chain was built outward from the floppy end of the AraC domain. Once the specified length was built, a PyRosetta function was used to connect the end of the linker to the antibody chain.
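As a rough illustration of the 21 starting sequences, the sketch below generates alternating glycine/serine linkers of 0 to 20 residues. It uses plain Python rather than PyRosetta; the function name `gs_linker` is hypothetical, and starting each linker with glycine is an assumption, since the order of the alternation is not stated above.

```python
def gs_linker(length, first="G"):
    """Return an alternating Gly/Ser sequence of the given length.

    Assumption: the linker starts with glycine unless first="S" is passed.
    """
    if length < 0:
        raise ValueError("linker length must be non-negative")
    if first not in ("G", "S"):
        raise ValueError("first residue must be 'G' or 'S'")
    pair = "GS" if first == "G" else "SG"
    # Repeat the pair enough times to cover the length, then trim.
    return (pair * ((length + 1) // 2))[:length]


# One candidate sequence per model: 0, 1, 2, ..., 20 residues.
linkers = [gs_linker(n) for n in range(21)]

for n, seq in enumerate(linkers):
    print(n, seq if seq else "(direct fusion, no linker)")
```

Each sequence would then be built onto the AraC chain and connected to the antibody as described above.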

Once the linker was connected, a PyRosetta mutation function was used to replace the serines and glycines with residues better suited to the geometry and spacing of the linker. The final step used the "relax" function in PyRosetta, which slightly perturbs and adjusts the entire structure and returns the conformation with the lowest score. The score, computed by a "scorefunction", is a measure of stability in arbitrary units (Rosetta Energy Units): generally, the lower the value, the more stable the structure according to Rosetta's calculations. Rosetta can calculate the score of a structure in many different ways, but for the linker we used the default full-atom energy function. After the structure was scored, an RMSD value was calculated comparing the original conformation of the AraC + antibody complex with the adjusted conformation that accounts for the linker. This protocol was repeated for all 21 models. To narrow down the number of viable models, the following procedure was used:

- If the AraC or antibody chains noticeably split apart after the linker was connected, the model was eliminated.

- If the linker consisted entirely of non-polar residues after the mutation function, the model was eliminated.

- If the linker attachment split apart after the relax function, the model was eliminated.

- Of the remaining models, those with low RMSD values were selected.
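The RMSD comparison and the elimination rules above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the script we ran: the `LinkerModel` record and its field names are hypothetical, the set of residues counted as non-polar is an assumption (classifications vary, especially for glycine), the RMSD is computed without superposition, and "low RMSD" is interpreted here as keeping the `keep` lowest values because no cutoff is given.

```python
import math
from dataclasses import dataclass

# Assumed one-letter codes for non-polar residues.
NONPOLAR = set("AVLIMFWPG")


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) points.

    Assumption: both structures are already in the same frame,
    so no superposition is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


@dataclass
class LinkerModel:  # hypothetical record, one per linker length
    length: int
    domains_split: bool      # chains split apart after connection
    linker_sequence: str     # linker residues after the mutation step
    attachment_broken: bool  # attachment split apart after relax
    rmsd_to_start: float     # RMSD against the linker-free complex


def select_models(models, keep=3):
    """Apply the three elimination rules, then keep the lowest-RMSD models."""
    survivors = [
        m for m in models
        if not m.domains_split
        # An empty linker (0 residues) is not treated as "all non-polar".
        and not (m.linker_sequence and set(m.linker_sequence) <= NONPOLAR)
        and not m.attachment_broken
    ]
    return sorted(survivors, key=lambda m: m.rmsd_to_start)[:keep]
```

In use, each of the 21 models would be recorded as a `LinkerModel` after the relax step and passed to `select_models`; the split and breakage flags were judged by visual inspection.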

This left us with three workable models for building the second linker, with first-linker lengths of 17, 13, and 4 residues. We built the second linker in much the same way as the first, except that we did not build all 21 residue lengths for each of the three models. Because of the symmetry of the AraC domain, we expected each second linker to be about the same length as the first. For the 4-residue linker model, for example, we only built and tested second linkers of lengths 1 to 7. Following the same procedure as before, we ended up with two models: one with linkers of 17 and 15 residues, and the other with linkers of 4 and 3 residues.
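The narrowed search for the second linker can be sketched as a window around the first linker's length. The window of 3 residues on either side is inferred from the single stated example (lengths 1 to 7 for the 4-residue model) and is an assumption for the other two models, as is capping the range at 20 residues; the function name is hypothetical.

```python
def second_linker_lengths(first_length, window=3, max_length=20):
    """Candidate second-linker lengths centred on the first linker's length.

    Assumptions: a symmetric window of +/- `window` residues, clipped
    to the 0..max_length range used for the first linker.
    """
    low = max(0, first_length - window)
    high = min(max_length, first_length + window)
    return list(range(low, high + 1))


for first in (17, 13, 4):
    print(first, second_linker_lengths(first))
```

For the 4-residue model this yields lengths 1 through 7, matching the range described above.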

A Ramachandran plot was generated for each model to help check feasibility:

For the short-linker model, the Ramachandran plot showed that 98% of the residues fell in the favoured region and 2% fell in the allowed region. For the long-linker model, the Ramachandran plot showed that 96.5% of the residues fell in the favoured region, 3.3% fell in the allowed region, and 0.2% fell in the outlier region. Generally it is expected that 98% of residues fall in the favoured region, 2% fall in the allowed region, and none fall in the outlier region.
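The comparison against this rule of thumb can be written as a small check. The function name and the wording of the messages are hypothetical; the percentages passed in are the ones reported above.

```python
def ramachandran_issues(favoured, allowed, outlier, min_favoured=98.0):
    """List deviations from the rule of thumb (percentages on a 0-100 scale)."""
    total = favoured + allowed + outlier
    if abs(total - 100.0) > 0.5:
        raise ValueError(f"percentages sum to {total}, expected about 100")
    issues = []
    if favoured < min_favoured:
        issues.append("favoured below expected")
    if outlier > 0:
        issues.append("outliers present")
    return issues


short_model = ramachandran_issues(98.0, 2.0, 0.0)
long_model = ramachandran_issues(96.5, 3.3, 0.2)
print("short linker:", short_model or "meets the rule of thumb")
print("long linker:", long_model or "meets the rule of thumb")
```

By this check the short-linker model meets the expectation, while the long-linker model falls slightly short in the favoured region and has a small fraction of outliers.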