It is difficult to maintain the function of engineered constructs within bacterial populations because cells often cannot sustain the additional burden that a construct imposes over many generations. To address this issue, we must quantify the burden that constructs impose on bacterial cells. We define burden as the percent reduction in growth rate that a construct causes in its host cell. To quantify burden, we measured the growth rates of E. coli cells containing BioBrick parts and used our R script pipeline to generate a burden value for each construct. Using these burden values, along with our model, we can better understand the evolutionary stability of these constructs and the consequences of using them in real-world applications. Synthetic biologists can use these burden measurements to make more informed decisions about which genetic parts to use in long-term or large-scale experiments.
Figure 1. Evolutionarily unstable genetic device in E. coli cells. The cells contain a genetic device expressing asPink, a chromoprotein. This genetic device is relatively unstable, as seen by colonies that are no longer colored.
When a genetic device reallocates cellular resources towards itself and away from the native processes essential for cellular growth, those native processes become stunted due to the lack of resources available. This results in slower cell growth. Thus, if mutations arise that render the genetic device nonfunctional, cells containing the nonfunctional device will have more resources available for growth. These cells will quickly outcompete cells with fully functioning genetic devices due to their fitness advantage, yielding a non-functional population. In other words, these devices are unreliable.
In order to identify the evolutionarily unstable parts, we must quantitatively characterize each part’s burden. We define “burden” as the percent reduction in cell growth rate. Parts with greater burden will have a greater reduction in growth rate and therefore be more evolutionarily unstable.
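The definition above can be written as a one-line calculation. The following is a minimal sketch in Python, not the team's R pipeline; the function name `burden_percent` and the example values are hypothetical and chosen only to illustrate the arithmetic.

```python
def burden_percent(growth_rate: float, reference_rate: float) -> float:
    """Percent reduction in growth rate relative to a burden-free reference."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return (1.0 - growth_rate / reference_rate) * 100.0


# Hypothetical rates, not measured values: a part growing at 0.75 of the
# reference rate has a burden of 25 percent.
print(burden_percent(0.75, 1.0))  # 25.0
```

With growth rates normalized so that a non-burdensome strain sits at 1.0, the reference is simply 1.0 and the burden reduces to `(1 - normalized_rate) * 100`.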
We developed a standardized method to quantitatively and efficiently measure the burden of hundreds of BioBrick parts, which allows us to differentiate between reliable and unreliable parts.
The Burden Assay
Figure 2. Burden assay workflow.
To measure burden, we developed the burden assay (Figure 2). The assay is based on the work of Ceroni et al. and uses a strain of E. coli that carries GFP in its genome. In their paper, GFP was used to monitor cellular capacity. In our burden assay, GFP is used to measure a specific type of burden that we describe in a section below. To conduct the assay, we transformed 497 parts from the iGEM Registry into our burden monitor strain. We then measured the OD600 and GFP expression levels of 330 strains containing different BioBrick parts. These data were fed into our R script pipeline to calculate each strain's growth rate and GFP expression rate. These measurements were then used to determine burden values, which we define as percent reduction in growth rate.
To calculate the burden value (growth rate reduction), we first had to generate data using a plate reader. Further details about the protocol for collecting plate reader data can be found on the Experiments page. In summary, each 96-well plate held 23 strains, each in three separate wells, and was grown for at least 6 hours in the plate reader. The OD600 and GFP expression levels were recorded every 10 minutes. Every plate contained controls, including LB only (blanks) and control strains. We used the LB blanks to subtract background absorbance and fluorescence from each of our measured BioBrick strains. The raw data from the burden assay were passed to the Burden.R script (https://github.com/barricklab/igem2019), which calculated the growth rate and GFP expression rate for each measured BioBrick.
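To make the blank subtraction and growth rate steps concrete, here is a minimal Python sketch. It is not a port of Burden.R: it assumes the growth rate is taken as the steepest slope of ln(OD600) against time over a sliding window, which is a common approach but may differ from what the script does. The function names, the window size, and the synthetic data are all hypothetical.

```python
import math


def subtract_blank(readings, blank_readings):
    """Subtract the time-matched LB-only (blank) reading from each measurement."""
    return [r - b for r, b in zip(readings, blank_readings)]


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def max_growth_rate(times_h, od, window=5):
    """Largest slope of ln(OD) over any run of `window` consecutive points."""
    best = float("-inf")
    for i in range(len(od) - window + 1):
        chunk = od[i:i + window]
        if min(chunk) <= 0:  # log is undefined at or below the blank level
            continue
        rate = slope(times_h[i:i + window], [math.log(v) for v in chunk])
        best = max(best, rate)
    if best == float("-inf"):
        raise ValueError("no window with strictly positive OD values")
    return best


# Synthetic example: readings every 10 minutes for 6 hours, a constant blank
# of 0.04, and purely exponential growth at 0.9 per hour.
times_h = [i / 6 for i in range(37)]
raw_od = [0.04 + 0.01 * math.exp(0.9 * t) for t in times_h]
corrected = subtract_blank(raw_od, [0.04] * len(raw_od))
print(round(max_growth_rate(times_h, corrected), 3))
```

Real cultures leave exponential phase as they approach saturation, which is why the sketch takes the maximum over windows rather than fitting the whole curve. The same blank subtraction applies to the fluorescence readings; the GFP expression rate calculation is not sketched here.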
Normalization of Growth Rates
Figure 3. Normalization of growth rates of each measured genetic part. Left: raw growth rate data from the burden assays. Right: normalized growth rate data. These normalized data are used for further measurements and characterization of burden for each part.
Figure 3 displays the mean growth rate of each strain, as calculated by our script from the raw plate reader data. As expected, we saw large variation in growth rates between experiments. To compare experiments, we needed to normalize the datasets. When normalizing the data, we assumed that most parts are not burdensome. This assumption is reasonable because the distribution of growth rates was strongly left-skewed. In summary, we normalized our strain data to an arbitrary mean maximum growth rate of 1.0. Normalizing the growth rates within each experiment allowed us to compare growth rates between experiments. This was done using the normalize_experiments_with_GFP (1).R script on GitHub (https://github.com/barricklab/igem2019).
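One way to picture this step is as a per-experiment rescaling. The Python sketch below is an illustration only and does not reproduce normalize_experiments_with_GFP (1).R. It assumes that, because most parts are not burdensome, the median growth rate in an experiment is a reasonable stand-in for the burden-free rate. The choice of the median, the function name, and the example values are all assumptions.

```python
from statistics import median


def normalize_experiment(rates_by_strain):
    """Scale one experiment's growth rates so the median strain sits at 1.0."""
    reference = median(rates_by_strain.values())
    if reference <= 0:
        raise ValueError("median growth rate must be positive")
    return {strain: rate / reference for strain, rate in rates_by_strain.items()}


# Hypothetical experiments: the second plate grew uniformly at half the speed
# of the first, so both normalize to the same values.
experiment_1 = {"part_a": 0.5, "part_b": 1.0, "part_c": 1.0}
experiment_2 = {"part_a": 0.25, "part_b": 0.5, "part_c": 0.5}
print(normalize_experiment(experiment_1))
print(normalize_experiment(experiment_2))
```

A median-based reference is robust to a minority of slow-growing strains, which matches the stated assumption. It would be misleading on a plate where most parts are burdensome.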
Reduced Growth Rates Show Parts Have Burden
Figure 4. Graph of normalized specific growth rates for each strain isolate.
Using Figure 4, we can assign each part a burden value based on growth rate. This plot shows the 330 different BioBricks, each assayed in triplicate during the burden assays. The graph was generated using the normalized growth rates (see Figure 3). Using these data, we were able to identify which parts were burdensome.
As a further application of this measurement, we used the Anderson series to visualize the effect of varying promoter strength on metabolic burden, specifically translational burden. These parts all share a ribosome binding site of the same strength but have promoters of different strengths upstream of an RFP gene. Using the normalized growth rates for those parts, we calculated their burden values (percent reduction in growth rate), as shown in Table 1 below. The burden values determined by our measurement system allow people to predict how evolutionarily stable the devices will be in real-world applications, ranging from cloning to industrial-scale fermenters. Having these values at hand is therefore useful across a diverse array of fields.
GFP Expression Rate Normalization
GFP expression rates were normalized using a method akin to the growth rate normalization method. By normalizing GFP expression rates, we were able to determine whether burden is translational or due to “other” causes. Translational burden is the burden that arises solely from ribosome reallocation. Other burden is the burden that arises from causes other than ribosome reallocation. Examples of other burden include toxicity, metabolic resource misallocation, and population interactions.
Control Strain Normalization
Figure 6. Normalization of control strains from each burden assay. The graph on the left illustrates the regression lines yielded from the control strains of each burden assay. The graph on the right is a product of the normalization, which resulted in a single regression line of control strains. This regression line is used for the graph of all measured BioBricks (shown below).
Using a set of 5 strains as controls, we fit a burden regression to our normalized GFP expression rates plotted against normalized growth rates. We performed over 20 distinct burden assays with these control strains, which produced the data for Figure 6. We then compared the control strain regressions across experiments and omitted experiments whose slopes were statistical outliers relative to the rest. To do this, we calculated the 95% confidence interval of the vector of slopes and, as a first pass, omitted experiments whose control strain slope fell outside this interval. The result is a linear regression for the control strains whose slope is not significantly different from that of y = x.
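The slope-based filtering can be sketched as follows. This Python example reads the interval as the mean of the slopes plus or minus 1.96 sample standard deviations, which is an assumption about what the original script computes. The function names and the slope values are hypothetical.

```python
from statistics import mean, stdev


def ols_slope(xs, ys):
    """Least-squares slope of ys (GFP rate) against xs (growth rate)."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def experiments_within_interval(slopes, z=1.96):
    """Names of experiments whose control slope lies within mean +/- z * sd."""
    values = list(slopes.values())
    center, spread = mean(values), stdev(values)
    low, high = center - z * spread, center + z * spread
    return [name for name, s in slopes.items() if low <= s <= high]


# Hypothetical control slopes from 20 assays: 19 close to 1.0 and one at 2.0.
slopes = {f"assay_{i:02d}": 1.0 for i in range(1, 20)}
slopes["assay_20"] = 2.0
kept = experiments_within_interval(slopes)
print(len(kept), "assay_20" in kept)  # 19 False
```

Because an outlier inflates the standard deviation it is compared against, this kind of filter only works well when there are many experiments. With a handful of assays, a single aberrant slope can widen the interval enough to include itself.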
Figure 7. Scatter plot of GFP expression rates as a function of growth rates for strains containing BioBricks. Points categorized as having "other burden" have additional burden beyond translational burden.
Figure 7 shows which parts have significant burden. The plot distinguishes parts with no observable burden (green points) from those with a quantifiable burden (red and blue points). All burdensome parts have a significant amount of translational burden. However, the red points deviate significantly from the regression line: their growth rates are significantly lower than they would be if all of the part's burden were translational. Thus, we can say that they have some form of “other burden.” Examples of other burden include toxicity, metabolic resource misallocation, and population interactions. In summary, we can use this scatterplot to identify whether a part (1) was burdensome, (2) imposed only translational burden, or (3) imposed both translational and other burden.
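The three-way reading of the scatterplot can be expressed as a simple decision rule. The Python sketch below replaces the significance tests described above with fixed cutoffs, purely for illustration. The function name, both threshold values, and the example points are hypothetical, and the control regression line is taken to be y = x on the normalized axes.

```python
def classify_part(norm_growth, norm_gfp, burden_cutoff=0.1, line_tolerance=0.1):
    """Classify a part from its normalized growth and GFP expression rates.

    Assumes translational burden lowers both rates equally (the line y = x),
    so a growth rate well below the GFP rate signals additional burden.
    """
    if 1.0 - norm_growth < burden_cutoff:
        return "no observable burden"
    if norm_gfp - norm_growth > line_tolerance:
        return "translational and other burden"
    return "translational burden only"


# Hypothetical points, not measured parts.
print(classify_part(0.98, 0.97))  # no observable burden
print(classify_part(0.75, 0.75))  # translational burden only
print(classify_part(0.50, 0.75))  # translational and other burden
```

In practice the cutoffs would come from the replicate variance and the confidence band of the control regression rather than from fixed numbers.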
Figure 8. Density plot of the burden value distribution of BioBricks as the percent reduction in growth rate. The grand means for each of our control strains are indicated with colored lines to show how they contribute to defining a translational burden threshold.
Using all the data from the scatterplot (Figure 7), we then plotted the distribution of burden values according to how frequently each value occurred in our data (Figure 8). This plot supports the assumption that most parts are not burdensome. The control strains, shown as vertical lines, mark the lower burden values from which we created our regression line. This regression line provides the threshold that lets us differentiate between the two types of burden, translational and other. We saw very few or no parts with a burden greater than 40%. Because we have no evidence of parts with a burden higher than 40%, it is likely that this value marks an “unclonable” region, in which the construct breaks before the culture can reach saturation.
As presented above, we have introduced to iGEM a new type of measurement value, the burden value. This burden value was determined using our metabolic burden assay, which involves measuring OD600 values and GFP expression levels and running these values through our R script to generate growth rates and GFP expression rates. During this process, we normalized both rates across experiments and normalized the control strain regressions.
The burden value is important because it can predict how reliable a part is and, thus, how appropriate it is for use in industrial-scale bio-fermentation and in synthetic biology labs around the world. We present the burden values quantified for 330 parts from the iGEM Registry, including the Anderson series of promoters, which we discuss on this page as well as on their respective BioBrick pages.