Team:Ruperto Carola/MMT

Modelling evolution

Models of cell growth and gene expression usually assume a fixed genetic background for the cells involved. That assumption is at odds with evolution, which introduces mutation and selection. Once we add those, standard models break down. To model evolution properly, we need to integrate both the behaviour of our cells' gene regulatory networks and their competition within a larger population. Taking in the whole model at once can be quite the brain-melter, so let us go step by step.

We begin by setting down the equations for the population dynamics of a single type of cell, assuming our medium can sustain the growth of infinitely many cells. The population is a function \(N : T \to C\), and its rate of change in time is made up of a rate of cell division \(\gamma : \mathbb{R}_+\) and a death rate \(\delta : \mathbb{R}_+\):

\[ \frac{dN}{dt} = \gamma N - \delta N \]

Such media of course do not exist, and realistic conditions dictate a finite carrying capacity \(K : C\). We can adjust the above equation to enforce a given \(K\) using logistic growth

\[ \frac{dN}{dt} = (\gamma - \delta) \cdot N \left( 1 - \frac{N}{K} \right) \]

or by explicitly coupling growth to the nutrient concentration \(S : T \to C\) via the Monod equations:

\[ \frac{dN}{dt} = \frac{\gamma S}{K + S} N - \delta N \]

\[ \frac{dS}{dt} = -\upsilon \frac{\gamma S}{K + S} N \]

where \(\upsilon : \mathbb{R}_+\) is a yield factor giving nutrient consumption per unit growth. Note that in the Monod equations \(K\) acts as a half-saturation constant, the nutrient concentration at which the division rate is half of \(\gamma\), rather than as a cap on \(N\).

That's it for the basics. To take the first step towards modelling evolution, consider a set of possible genotypes \(\Gamma : \mathsf{FinSet}\), and the concentration of a subpopulation of a given genotype \(N : \Gamma \to T \to C\), with genotype-specific growth rate \(\gamma : \Gamma \to \mathbb{R}_+\). You can think of differences in \(\gamma\) across \(\Gamma\) as differences in fitness between the genotypes.
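The single-population Monod equations above can be explored numerically. The following is a minimal sketch in Python, not the team's implementation: the explicit Euler integrator, the function names `monod_rhs` and `simulate_monod`, and all parameter values are assumptions chosen purely for illustration.

```python
def monod_rhs(n, s, gamma, delta, k_half, upsilon):
    """Right-hand side (dN/dt, dS/dt) of the Monod equations for one cell type."""
    growth = gamma * s / (k_half + s) * n  # nutrient-limited division term
    return growth - delta * n, -upsilon * growth


def simulate_monod(n0, s0, gamma=1.0, delta=0.0, k_half=0.5, upsilon=2.0,
                   dt=0.01, steps=2000):
    """Integrate with explicit Euler steps and return the list of (N, S) states.

    All default parameter values are illustrative assumptions.
    """
    n, s = n0, s0
    trajectory = [(n, s)]
    for _ in range(steps):
        dn, ds = monod_rhs(n, s, gamma, delta, k_half, upsilon)
        n, s = n + dt * dn, s + dt * ds
        trajectory.append((n, s))
    return trajectory


if __name__ == "__main__":
    n_end, s_end = simulate_monod(n0=0.01, s0=1.0)[-1]
    print(f"final N = {n_end:.4f}, final S = {s_end:.2e}")
```

With \(\delta = 0\), the quantity \(S + \upsilon N\) is conserved by these equations, so the population levels off once the nutrient is used up. In this variant it is the nutrient supply, not an explicit carrying capacity, that limits growth.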
Subpopulations of fixed genotype thus compete for the nutrients in the medium:

\[ \frac{dN_k}{dt} = \frac{\gamma_k S}{K + S} N_k - \delta N_k \]

At this point, we have recovered several aspects of evolving systems: the presence of multiple genotypes, and selection via competition between subpopulations for growth and nutrients. We are still missing one last aspect, without which everything breaks down. We are missing mutation, which provides a way for individuals to move between genotypes \(k : \Gamma\).

In most systems, DNA mutation is tightly coupled to DNA replication, which is in turn coupled to cell division. We therefore want to extend our model so that individuals move across genotypes as they divide. We can achieve this by introducing an additional factor \(\pi_{k' \to k}\), with \(\pi : \Gamma \to \Gamma \to \Delta^1\), giving the probability of mutating from genotype \(k'\) to \(k\) at each cell division. For each \(k'\), these probabilities sum to one over \(k\), with \(\pi_{k' \to k'}\) the probability that a division leaves the genotype unchanged. Coupled into our model, we arrive at:

\[ \frac{dN_k}{dt} = \sum_{k' : \Gamma} \pi_{k' \to k} \cdot \frac{\gamma_{k'} S}{K + S} \cdot N_{k'} - \delta N_k \]

where we gather up the contributions of mutation from each subpopulation by summing over \(k' : \Gamma\).

Now that we have all the parts, we can rewrite this equation in a more succinct, descriptive form:

\[ \frac{d\mathbf{N}}{dt} = \gamma_0 \boldsymbol{\pi}\mathbf{f} \mathbf{N} - \delta \mathbf{N} \]

where \(\mathbf{N} : \Gamma \to T \to C\) is the vector of concentrations \(N_k\) for each subpopulation of genotype \(k : \Gamma\), \(\gamma_0 : \mathbb{R}_+\) is a maximum growth rate, \(\mathbf{f} : \Gamma \to (0, 1)\) is a relative fitness (relative growth rate), and \(\boldsymbol{\pi} : \Gamma \to \Gamma \to \Delta^1\) is the mutation matrix with elements \(\pi_{k' \to k}\). Component-wise, the product reads \((\boldsymbol{\pi}\mathbf{f}\mathbf{N})_k = \sum_{k' : \Gamma} \pi_{k' \to k} f_{k'} N_{k'}\), so that \(\gamma_k = \gamma_0 f_k\). The nutrient factor \(S / (K + S)\) of the previous equation is not written out in this compact form. Note that for the evolution of traits involving interaction between cells, the fitness \(\mathbf{f}\) may have an arbitrary dependence on \(\mathbf{N}\).
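To make the mutation-coupled equation above concrete, here is a small sketch in Python that integrates it for a handful of genotypes. It is an illustration under stated assumptions rather than the model code used by the team. The nutrient equation summed over all genotypes is an assumed extension of the single-population Monod equation. The names `euler_step` and `simulate`, the three-genotype example, and every parameter value are hypothetical.

```python
def euler_step(pop, s, gamma, pi, delta, k_half, upsilon, dt):
    """Advance the mutation-selection model by one explicit Euler step.

    pop[k]   : concentration N_k of genotype k
    gamma[k] : growth rate of genotype k
    pi[j][k] : probability that a dividing cell of genotype j yields genotype k
    """
    size = len(pop)
    monod = s / (k_half + s)
    # Division flux of each parental genotype j, before mutation is applied.
    births = [gamma[j] * monod * pop[j] for j in range(size)]
    new_pop = []
    for k in range(size):
        # Sum over parental genotypes j of pi_{j -> k} * (division flux of j).
        inflow = sum(pi[j][k] * births[j] for j in range(size))
        new_pop.append(pop[k] + dt * (inflow - delta * pop[k]))
    # Assumption: every division consumes nutrient, whatever the genotype.
    new_s = s - dt * upsilon * sum(births)
    return new_pop, new_s


def simulate(pop0, s0, gamma, pi, delta=0.0, k_half=0.5, upsilon=1.0,
             dt=0.01, steps=3000):
    """Run the model for `steps` Euler steps and return the final (pop, s)."""
    pop, s = list(pop0), s0
    for _ in range(steps):
        pop, s = euler_step(pop, s, gamma, pi, delta, k_half, upsilon, dt)
    return pop, s


# Hypothetical example: three genotypes on a line, neighbours one mutation apart.
M = 1e-3  # assumed per-division mutation probability to each neighbour
GAMMA = [0.5, 0.75, 1.0]
PI = [[1 - M, M, 0.0],
      [M, 1 - 2 * M, M],
      [0.0, M, 1 - M]]

if __name__ == "__main__":
    final_pop, final_s = simulate([1e-4, 0.0, 0.0], 1.0, GAMMA, PI)
    print("final concentrations:", final_pop, "nutrient left:", final_s)
```

The death rate defaults to zero here so that the sketch describes a closed batch culture. With each row of `PI` summing to one, \(S + \upsilon \sum_k N_k\) is then conserved, which gives a quick sanity check on the integration.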

Now, the observant reader may wonder where we get our fitness \(\mathbf{f}\) from. Depending on the level of abstraction at which we want to treat our system, we may indeed directly specify a function \(f\) from genotypes \(\Gamma\) to fitness values \((0, 1)\). However, we may also specify actual gene regulatory networks in our cells, and work our way up from there. Briefly, the second case involves treating gene expression as a fast process relative to cell division, and assuming partial steady-state behaviour. For more information, take a look at [gene regulatory networks]().

Our model differs from the majority of the literature on modelling evolution, as we focus on non-saturated liquid culture with population growth. To see how our model relates to the literature, refer to [game theory]().

Having laid the foundations for our model, let us now move on to a specific system for directed evolution in yeast.

Basic Notation

\( \mathbb{R} \) the type of real numbers

\( \mathbb{N} \) the type of natural numbers

\( \Delta^k \) the \(k\)-simplex \(\{x : [0, 1]^{k+1}\;|\;\sum_i x_i = 1\}\)

\( \mathsf{FinSet} \) the type of finite sets

\( \mathbb{R}_+ \) the type of real numbers greater than or equal to zero \(\left\{ x : \mathbb{R}\;|\; 0 \leq x \right\}\)

\( T := \mathbb{R}_+ \) the type of time

\( C := \mathbb{R}_+ \) the type of concentration

\( \Gamma : \mathsf{FinSet} \) the type of genotypes
