The role of whole genome doubling in cancer evolution - Quim Martí Baena - e ...

 
BACHELOR'S THESIS / BIOMEDICAL ENGINEERING 2021

The role of whole genome doubling in cancer evolution

Quim Martí Baena

Bachelor's Thesis UPF 2020/2021

Thesis Supervisor(s):

Dr. Ricard Solé (Department CEXS)

PhD(c). Guim Aguadé (Department CEXS)
Dedication
To my family and friends, and especially my parents, for getting me interested in
science since I was a kid. This work is dedicated to all of you.
Acknowledgments
First, I want to thank my supervisors, who have helped me throughout this project. Thank
you, Guim, for your work as a teacher in the complex diseases subject, where I became
passionate about cancer evolution and, more especially, whole genome doubling. Without
it, I would not have written this Bachelor's thesis. I also want to thank you for your
guidance throughout the project. You have taught me to be a better researcher.

I also want to thank Ricard, who has allowed me to be a part of, in my opinion, one
of the best research groups in Barcelona, the complex systems lab. Thank you for your
knowledge and guidance throughout this project.

I also want to thank Frederic and Blai for agreeing to review my Bachelor's thesis.

In addition, I want to thank all the members of the Hormonic project (Tomas, Jaume,
Miriam, Andreu and Edu). It has been challenging to develop this project in parallel
with our Bachelor's theses, but the talks and ideas shared during breaks have made it
all worth it.

Finally, I want to thank my family and friends for their support throughout this journey.
I want to thank my mother, especially, who helped me tackle genetic databases and
reviewed my work.
Summary/Abstract
Whole genome doubling (WGD) is one of the most common events in the early stages of
cancer evolution. However, a consistent explanation for the pervasiveness of WGD across
cancer types remains elusive. The duplication of the whole karyotype, produced by errors
in cell division, is often followed by an increase in chromosomal instability (CIN) and
intratumor heterogeneity, possibly allowing cancer cells to rapidly evolve and overcome
selective barriers. This would explain why WGD has been associated with poor prognosis
and multi-drug resistance across several cancer types, but it is not sufficient to account
for why WGD arises and is selected for even before the onset of CIN. In this work, a
mathematical framework to model instability in the cancer genome is presented, inspired
by early virus mutagenesis models. By considering the intertwined effects of ploidy and
mutational rates in a simplified genome, the model is able to capture how the average
chromosome number correlates with potential evolvability. This, in turn, might point
towards WGD providing a buffering effect to cancer cells that could allow the presence of
the increased genome instability that is produced by CIN. In addition, our model indicates
that increasing ploidy values does not allow tumors to explore much higher microsatellite
instability levels, indicating that WGD might only be an evolutionary advantage in the
presence of chromosomal instability. This result sheds light on the previously unresolved
question of why WGD is an uncommon event in MMR-deficient cancers.

Keywords

Reliability Theory, Cancer Evolution, Genome Instability, Cancer Aneuploidy, Whole
Genome Doubling
Prologue
Cancer is a disease closely linked with an ever more aging population [1]. As life
expectancy increases, the odds of developing this type of disease during a lifetime will
grow. Since the discovery of cancer, research institutions have been trying to find
a universally reliable treatment. One reason a suitable treatment has not yet been found
is the vast complexity and variability of cancer.

On the one hand, cancer can arise in virtually every tissue of a person's body. This gives
different initial conditions and environments in which evolution can act in different
directions. On the other hand, noticeable intertumor heterogeneity has also been detected
among cancers of the same type, where the majority of mutations have a very low incidence
in the whole cancer population [2]. On top of that, even within a single tumor of one
patient, the overwhelming heterogeneity in the genomes of different cancer cells is
thought to be the cause of drug resistance and malignancy in tumors [3].

As a result of this genetic heterogeneity, not only do cells within a single tumor differ,
but the processes and changes in the genome that are key for the generation of a
specific type of cancer might not be present in another type of cancer [2]. Despite this
variability, there are specific processes affecting a cancer cell's genome that are
commonly found because they are essential to generate the Hallmarks of cancer. These
are defined as certain functions that cancer cells need to acquire in order to survive,
divide indefinitely and invade tissues, among others [4]. The most common event in cancer
is the mutation of TP53, closely followed by whole genome doubling (WGD) [5]. However,
although TP53 has been thoroughly researched since the discovery of its importance
in cancer, the prevalence of WGD is still an unresolved mystery that is only now
beginning to unravel.

In order to understand the role of WGD in cancer, there is a need to comprehend the
non-trivial, combined effect of ploidy and mutation rate on tumor evolution.
Mathematical modeling is now considered a valuable tool for characterizing the behavior
of living systems. Over time, modeling has become a widely accepted approach in cancer
research. For example, it allowed for a minimal characterization of the interactions
between a tumor and the immune system, among other important discoveries [6]. Thus,
nowadays, modeling represents an important tool that can help us define
phenomena, even in the midst of the complex genetic heterogeneity found throughout
cancer.
Index
1 Introduction
  1.1   Cancer, a disease of the genome
        1.1.1   Evolutionary footprints in the cancer genome
        1.1.2   Microsatellite instability
        1.1.3   Chromosomal instability
        1.1.4   Whole genome doubling
  1.2   State of the art
        1.2.1   Whole genome doubling models
        1.2.2   Negative selection models
  1.3   Scope of the project

2 Methods
  2.1   Genome model
  2.2   Mutation accumulation models
        2.2.1   Dominant genes
        2.2.2   Recessive genes
  2.3   Reliability theory models
        2.3.1   Probabilistic landscape
        2.3.2   Evolving landscape

3 Results
  3.1   Accumulation of mutations across gene families
  3.2   Ploidy and genetic instability in cancer evolution
  3.3   Optimal instability and ploidy levels in evolving tumors

4 Discussion
  4.1   Accumulation of mutations
  4.2   Reliability in 1 division
  4.3   Reliability in evolving tumors
  4.4   Conclusions

Bibliography
Supporting information
  S.1   Parameter estimation
  S.2   Recessive mutation accumulation model
  S.3   Probabilistic landscape
        S.3.1   Probabilistic landscape definition
        S.3.2   Optimal mutation rate
        S.3.3   Optimal ploidy
  S.4   Evolving landscape
        S.4.1   Optimal mutation rate
        S.4.2   Optimal ploidy
List of Figures
  1    The hallmarks of cancer
  2    Frequency of mutations in recurrently mutated cancer genes
  3    Representations of CIN events in diploid cells with 1 chromosome
  4    M-FISH karyotypes of a normal and cancerous cell
  5    Genome model for different ploidies
  6    Reaction-like schemes of the recessive mutation model for different ploidies
  7    Evolution in the accumulation of activated oncogenes/inactivated housekeeping genes
  8    Visual representation of the oncogenic probability landscape
  9    View of the probabilistic landscape for diploid and MMR-deficient cells
  10   Optimal instability and ploidy levels in evolving tumors
  11   Evolution of the research line presented in this work
List of Tables
  S1   Parameter estimation for the presented models
1 Introduction
1.1 Cancer, a disease of the genome

Since the discovery of cancer as a genetic disease, researchers have been trying to
understand the mechanisms that act on cancer evolution. Nowadays, cancer is understood as a
disease driven by complex phenotypic alterations in rogue cells that deploy unicellular-like
replication characteristics and escape multicellular control [7]. These phenotypic
alterations, shaped by fine-tuned selective pressures, arise after genome
alterations that modify single-nucleotide sequences, chromosomal configuration or
epigenetic processes [8, 9, 10]. The deregulation of signal transduction pathways allows
cancer cells to gain the Hallmarks of cancer, 10 basic functions that cancer cells need to
survive and prosper, ranging from enabling immortality to inducing angiogenesis
(Figure 1) [4]. This means that in cancer, normal healthy cells, through the mechanisms
of evolution, acquire phenotypes that allow them to proliferate and invade tissues in an
uncontrolled manner. But how are these beneficial phenotypes acquired?

Figure 1: The hallmarks of cancer: a group of 10 fundamental functions that cancer
cells gain due to the accumulation of changes in the cell's genome at the gene and the
karyotype level. Image edited from Hanahan and Weinberg's work [4].

1.1.1 Evolutionary footprints in the cancer genome
As stated in the prologue, cancer is a heterogeneous disease characterized by very few
universal events that are shared across patients and tumor sites (Figure 2) [2]. Such cancer
heterogeneity ranges from the variability resulting from different tissues and
environmental constraints to intratumoral heterogeneity (ITH) itself. For most cancers,
there are thousands of genetic/epigenetic changes that contribute to carcinogenesis,
yet only three genes are mutated in more than 10% of patients (TP53, PIK3CA,
BRAF) [2]. In addition, most mutations found in tumor subclones are neutral or mildly
deleterious passengers, meaning that they do not confer any apparent cancerous capacity,
and only a handful of mutations appear to be positively selected (drivers) [11].

This suggests that the fitness landscapes underlying cancer evolution are very flat and
corrugated, with slight changes in the cells' fitness deciding the fate of cancer evolution
[8, 11]. Amid such a heterogeneous and vast mutational landscape, there is a need to look
beyond single mutations to find universal patterns characterizing evolution [12]. To do so,
we need to understand the dynamics of cancer instability at two main levels of genome
organization: genes and chromosomes.

Figure 2: Frequency of mutations in recurrently mutated cancer genes in different tumor
types. As can be seen, the only very commonly mutated gene across several tumor types
is TP53. Image taken and adapted from Martincorena et al.'s work on the mutational
landscape of cancer [2].

1.1.2 Microsatellite instability
Microsatellite instability (MIN) can be defined as changes in specific loci in the genome.
One example of this phenomenon is single nucleotide polymorphisms (SNP), which change
the base pair at a given locus in the genome. Although these types of mutations have
also been found to occur in healthy cells, MIN in cancer cells is produced by acquiring
the hypermutator phenotype. This phenotype, characterized by extremely high levels of
genetic errors, is caused by defects in the mismatch repair (MMR) mechanisms responsible
for DNA replication fidelity during cell division [13, 14]. Biallelic loss of any MMR-related
gene (e.g. MLH1, MSH2) can fuel microsatellite instability by increasing the mutation rate
(hypermutator phenotype) [14].

This higher mutation rate increases the acquisition frequency of driver mutations, which
are located in oncogenes and tumor suppressor genes, but also increases the acquisition
rate of passenger and disadvantageous mutations [13]. This implies an evolutionary
tension, captured early on by the error threshold hypothesis of Quasispecies Theory, between
mutating enough to evolve and not mutating so much that deleterious mutations dominate
[15]. In cancer, a similar mechanism could explain how tumors survive and progress with
highly unstable genomes [16].

Three main gene families are relevant in the microsatellite instability picture of cancer,
namely oncogenes, tumor suppressor genes and housekeeping genes. Oncogenes (OG)
are genes that promote cell growth and division. By mutating or overexpressing this type
of gene, cancer cells can increase their proliferation rate, thus increasing their fitness.
Evolution in oncogenes typically works through dominant gain-of-function mutations [17, 18,
19]. This means that only one mutated copy of an oncogene is needed to increase the
cell's fitness. Examples of this behavior are found in the oncogenes EGFR
and K-RAS, among others [19, 20]. Typically, oncogenes are affected by mutations at
certain loci that increase either protein functionality or expression. This
means that mutations in oncogenes will typically be missense mutations, insertions,
in-frame deletions or amplifications of the gene [8, 19, 21].

Tumor suppressor genes (TSG) can be classified into two groups, the gatekeepers and the
caretakers [22]. On the one hand, the gatekeepers are responsible for the regulation of the
cell cycle [14]. If mutated, gatekeepers lose their ability to regulate cell growth by
stopping the progression of the cell cycle or inducing apoptosis under certain conditions,
such as an overexpressed oncogene [22, 23]. Because of this, cells with mutated gatekeeper
TSG are thought to have increased fitness. On the other hand, the caretakers are
responsible for maintaining genome integrity [14]. If they are mutated, genome instability
can appear. Depending on the type of genome instability they fuel, caretakers can be further
divided into MIN-inducing (MLH1, MSH2) and CIN-inducing (WRN, ATM) [14].
Mutations in caretakers do not increase the cell's fitness by themselves, as they only
increase the rate at which genomic alterations are produced.

Typically, evolution in TSG follows Knudson's two-hit hypothesis, which states
that both alleles of a TSG need to be mutated or deleted in order to inactivate the gene [9,
17]. TSG are typically altered through protein-truncating mutations along their entire
length, missense mutations in critical regions such as the 5' or 3' splice sites, or
deletions of the gene [8, 9].

Housekeeping genes (HKG) are those genes that maintain fundamental metabolic
functions of the cell and provide support throughout the cell cycle [24]. These genes are
expected to have an unchanged expression level across different cells and tissues [24, 25].
Due to their function in the cell cycle, some oncogenes and TSG have been classified as
housekeeping genes in some studies [24]. Here, for cancer cells, housekeeping
genes are taken to be only those that maintain the most basic cellular functions.

Mutations in housekeeping genes are considered to be deleterious, as they affect
fundamental cell functions that cannot be changed, even in cancer cells [26]. As with TSG,
mutations in housekeeping genes work in a recessive loss-of-function manner, meaning
that all copies of a housekeeping gene need to be mutated in order to decrease the cell's
fitness [27].

1.1.3 Chromosomal instability
Another signature of genome evolution in cancer is chromosomal instability (CIN),
characterized by the accumulation of chromosomal alterations such as translocations or
chromosomal loss/gain [10]. As each CIN event transforms a large part of the genome, CIN
is thought to enable the exploration of a phenotypic landscape in a way that cannot be
achieved with an increased mutation rate alone (microsatellite instability) [10, 28]. It is
thought that in order for cells with CIN to appear, CIN-tolerance genes (TP53, BRCA1)
need to be mutated [10, 29]. This allows the survival of CIN-positive cells that have
been through events such as chromosome missegregations, which produce cells that gain
or lose one chromosome copy, translocation events, which produce cells with new mixed
chromosomes, and whole genome doubling (WGD) events (Figures 3 and 4) [10].

Figure 3: Representations of CIN events in diploid cells with 1 chromosome. (a)
Chromosome missegregations generate cells that have gained and lost 1 chromosome copy [10].
(b) Cytokinesis failures generate cells with a duplicated karyotype (WGD) [10].

One of the best-known examples of how CIN enables the evolution of cancer in a way that
is very difficult to predict is the Philadelphia chromosome [30]. This aberrant chromosome,
known to produce chronic myeloid leukemia (CML), results from a reciprocal translocation
between chromosomes 9 and 22. The translocation produces a fusion gene (BCR-ABL1), which
allows cells to divide in an uncontrolled manner [31]. This example illustrates that the
genome instability produced by CIN is very different from that produced by MIN, because
chromosomal instability enables the exploration of the cancer landscape via routes
that are not available to MMR-deficient cells.

Although the effects of chromosomal instability on the karyotype of cancer cells had been
discovered just before the First World War, its perceived importance in cancer
evolution declined in the second half of the 20th century with the discovery of oncogenes
and, later, tumor suppressor genes [32]. These discoveries raised the expectation that 5-6
common cancer genes would be found responsible for the stepwise development of cancer [12].
Nowadays, gene-centric theories of cancer evolution are gradually being challenged by new
genome theories in which the karyotype is seen as responsible for the organization of gene
interactions in a species (karyotype coding) [12, 33].

Figure 4: M-FISH karyotypes of a normal (a) and cancerous (b) cell. Chromosomal
missegregations can have a big impact on cancer evolution. However, due
to the complexity of the genome, the precise role of CIN in cancer evolution is not yet
fully understood. Image edited from Duesberg et al. [33].

1.1.4 Whole genome doubling
Whole genome doubling (WGD) is the phenomenon by which, typically, a cell division
event fails once the entire DNA content has already been duplicated, thus producing a
cell with ploidy four [10]. Although cytokinesis failure seems to
be the most common cause of WGD, other events such as the rereplication of DNA can
also cause it. In addition, cells with a G1 arrest defect (generally due to TP53 knock-out)
are thought to have higher odds of undergoing WGD and surviving [5, 34].

Current data indicate that WGD events correlate with evidence of CIN, with tumors harboring
an average ploidy of 3.3 [5, 28]. Furthermore, for WGD to be so common across
cancer types, there must be an evolutionary advantage that allows this event to fixate in
the population [5]. It is thought that WGD is a mechanism used by cancer cells to
mitigate Muller's ratchet. In this process, asexual populations (like cancer) accumulate
deleterious mutations over time due to a lack of recombination [27, 35]. By doubling
the karyotype (WGD), the cell could be protecting the essential parts of the genome that
should not be mutated (housekeeping genes). With more copies of each gene, the
probability that all the copies of a single gene are mutated is lower. In evolutionary terms,
this means that cancer cells that have not doubled their genome will have a higher
risk of having housekeeping genes with all of their copies mutated, thus increasing the odds
of reducing the cancer cell's fitness.
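
The buffering argument above can be illustrated with a short worked example. As a
simplifying assumption made only for this illustration, suppose that each copy of a
housekeeping gene is inactivated independently with the same probability p (a hypothetical
symbol, not a parameter estimated in this work), and let φ denote the number of copies of
the gene. The probability that every copy is inactivated then falls geometrically with φ:

```latex
\begin{align*}
P_{\text{all copies mutated}}(\varphi) &= p^{\varphi}, \\
P_{\text{all copies mutated}}(2) &= p^{2}, \qquad
P_{\text{all copies mutated}}(4) = p^{4} = \bigl(p^{2}\bigr)^{2}, \\
\text{for instance, } p = 10^{-2}: \quad
P_{\text{all copies mutated}}(2) &= 10^{-4}, \qquad
P_{\text{all copies mutated}}(4) = 10^{-8}.
\end{align*}
```

Under these assumptions, doubling the genome squares an already small probability, which
is the sense in which WGD could shield housekeeping genes from complete inactivation.
Independence between copies is a simplification of this sketch.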

Even if segregation errors and structural aberrations on a per-chromosome basis do not
increase in WGD-positive cells (tetraploids) with respect to WGD-negative cells (diploids),
tetraploids seem to have a greater tolerance for chromosomal segregation errors than
diploids, which would explain the link between CIN and WGD [34]. On top of that, there is
evidence that CIN and WGD are mutually exclusive with microsatellite instability (MIN),
because the presence of both is thought to be deleterious [5]. Thus, cancer types
whose cells have quiet genomes but are MIN-driven will have a lower WGD frequency
than those that are microsatellite stable. This is especially evident in some colorectal
cancers, where the two different evolutionary pathways can be clearly distinguished [5].

Although this remains an active area of research, it is thought that there is a
balancing force in WGD that makes cells with a higher ploidy grow more slowly in early
stages, even if this difference disappears in later stages [34]. This implies that cancer
cells that have experienced WGD will not have many chromosome copies, because that would be
counterproductive in evolutionary terms.

1.2 State of the art

1.2.1 Whole genome doubling models
Even though WGD has not been studied thoroughly until the last decade, several models
that give insights on still unresolved aspects of WGD have already been proposed.

In the work of Gusev et al., a simple model of CIN that only included chromosomal
segregations produced cells that were around the optimal quasi-triploid state [36].
Nevertheless, if cells with 3.3N represent an evolutionary optimum for WGD-positive cells,
these results should depend on the location of cancer genes in the genome. In Laughney et
al.'s model, the effects of WGD on cancer genes are simplified into a pro- or
anti-proliferative tendency based on the OG and TSG that a chromosome arm carries [28].
Although this model also managed to capture cells around the optimal quasi-triploid state,
both results depended on imposing a hard upper limit of 8 chromosomal copies, beyond which
cells would not sustain a massive karyotype. The lack of experimental evidence for this value
implies that the findings on the optimal ploidy might need further research [28].

The only existing model that combines whole genome doubling and the hypothetical
beneficial effect of WGD on the accumulation of deleterious mutations can be found in Lopez
et al.'s work [27]. Here, WGD is modeled as an event that reduces the fitness cost of
passenger mutations but is itself associated with a fitness cost. In this model, the fate
of WGD selection depends on the balance between the benefits (reduced passenger fitness
cost) and disadvantages (WGD fitness cost) of WGD. This means that, for example,
a higher deleterious mutation rate translates into increased selection for WGD.
These results are based on currently unknown variables such as the WGD fitness
cost, implying that the parameter complexity of the model might hamper specific results
[34].

All in all, models that combine the different effects of WGD on the three primary cancer
gene families (oncogenes, TSG and housekeeping genes) have not yet been developed. As WGD
is an evolutionary pathway that appears to be selected across different tumor growth
circumstances, there is a need to include, in WGD models, the agents that drive cancer
evolution in all its forms. In the present work, a minimal framework able to capture the
effect of WGD on both cancer evolution and cellular viability mutations is described.

1.2.2 Negative selection models
As has been already stated, it is thought that WGD appears to mitigate the decay of the
cancer population resulting from the irreversible accumulation of deleterious mutations,
the so-called Muller's ratchet [27]. This phenomenon has already been studied in several
models that explain under which conditions the ratcheting appears.

One of the simplest models for a balance between driver and deleterious mutations can
be found in Nowak et al.'s work on reliability theory in the genome of HIV [37]. This
model introduces the concept of an optimal mutation rate that balances the need to
mutate escape mutant-generating genes while keeping necessary genes in place. With this
approach, the critical value only depends on the structure of the genome, meaning that
the model does not consider an underlying evolutionary landscape, which has proven to be
often too complex to define [38].
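
The idea of an optimal mutation rate can be illustrated with a toy calculation. The sketch
below is not the formulation used in [37]. It assumes, for illustration only, a genome with
`n_escape` genes in which a mutation is beneficial and `n_essential` genes that must stay
intact, each gene mutating independently with probability `mu` per generation (all names
are hypothetical). The success probability is the chance that at least one beneficial gene
mutates while no essential gene does, and its maximum depends only on the two gene counts,
that is, on the structure of the genome.

```python
def success_probability(mu: float, n_escape: int, n_essential: int) -> float:
    """Toy objective: P(>= 1 escape gene mutated AND no essential gene mutated).

    Assumes independent mutation of every gene with probability mu
    (an illustrative assumption, not the model of the cited work).
    """
    q = 1.0 - mu  # probability that a single gene stays unmutated
    return (1.0 - q ** n_escape) * q ** n_essential


def optimal_mutation_rate(n_escape: int, n_essential: int) -> float:
    """Closed-form maximiser of success_probability.

    With q = 1 - mu the objective is q**n - q**(n + k); setting its
    derivative to zero gives q**k = n / (n + k).
    """
    if n_escape < 1 or n_essential < 1:
        raise ValueError("both gene counts must be at least 1")
    ratio = n_essential / (n_essential + n_escape)
    return 1.0 - ratio ** (1.0 / n_escape)


if __name__ == "__main__":
    # Hypothetical gene counts, chosen only to show the trend.
    for n_essential in (10, 100, 1000):
        mu_star = optimal_mutation_rate(3, n_essential)
        print(f"n_essential={n_essential:5d}  optimal mu={mu_star:.5f}")
```

In this toy setting, the more genes that must be kept intact, the lower the optimal
mutation rate becomes.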

If the effect of a landscape is introduced, the maximum mutation rate that keeps a fit
population from decaying can be retrieved from the error threshold model [39]. The results
point out that, given a sufficient fitness decrease due to housekeeping gene mutations, the
fit population of cells can survive even if the mutation rate is increased. Similar results
arose from McFarland et al.'s work, where cancer cell populations with mutating driver
and passenger genes were simulated [26]. In this case, the results indicated that mildly
deleterious mutations slowed cancer progression more than highly deleterious ones, as
the latter were quickly eliminated by natural selection.

1.3 Scope of the project

The main objective of the present project is to develop a mathematical framework for the
role of WGD in cancer evolution, with the aim of understanding the pervasiveness of WGD
as an early phenomenon in cancer progression. From this, two major questions arise: the
effect of WGD as a possible actor mitigating Muller's ratchet, and the relation between
ploidy and mutational rate governing cancer evolution. Because whole genome doubling
is still an unresolved aspect of tumoral evolution, mathematical models of WGD could
also answer important open questions, such as the underlying mechanism linking
WGD to cells with a ploidy of 3.3N. Hence, we hypothesize that the possible selective
advantage of WGD is likely to result from a balance between the mutational signatures
across the three principal cancer gene families, namely oncogenes, tumor suppressors and
housekeeping genes. Furthermore, we expect to find a non-trivial correlation between
ploidy values and acceptable instability levels, for which WGD might allow for increased
tumor mutational loads.

2 Methods
2.1 Genome model

Having defined cancer as a disease of the genome, a simplification of the human genome
is proposed in order to develop a minimal framework able to capture the roles of WGD
and instability in cancer evolution. Although the human genome comprises 23 pairs
of chromosomes, the model starts by defining cancer cells as having a single genetic
compartment, acting as a single chromosome with all the genes that affect cancer evolution
(oncogenes, TSG and housekeeping genes) in place. Having defined our genome, we ask
how the genome structure correlates with the mutational behavior of
each gene type. As seen in Figure 5, as mutated alleles in oncogenes are dominant, these
genes are represented in series, meaning that a single genetic hit will change the encoded
proteins, thus producing a fitness change [17, 18, 19]. In contrast, tumor suppressor genes
and housekeeping genes are represented in parallel, as their mutated alleles are recessive
[9, 17, 27]. As in parallel circuits, mutations in all gene copies are needed to disrupt
their function.

Figure 5: Genome model for different ploidies with housekeeping genes (HKG), tumor
suppressor genes (TSG) and oncogenes (OG). (a) Haploids (b) Triploids (c) Diploids
(d) Tetraploids. For ploidies higher than 1, a single OG works as a system in series. In
contrast, a single TSG or HKG can be understood as a system in parallel.

2.2    Mutation accumulation models

2.2.1 Dominant genes
The main objective of this model is to describe, at a single-cell level, how genome ploidy
(here, the number of gene copies) affects the mutational processes underlying cancer
progression. Therefore, a central focus of the model is the study of the accumulation of
mutations in genes that work in a dominant gain-of-function manner (oncogenes) [17, 18].

In oncogenes, there is an increase in the cell's fitness with the mutation of only one copy
of the gene [19]. Oncogenes can be either in the 'OFF' state with no mutations across
all copies or in the 'ON' state with one or more copies of the gene mutated. This means
that our scheme will only consider the mutation of one of the copies, greatly simplifying
the model.

To construct a general ordinary differential equation (ODE) that describes the kinetics of
mutated oncogenes in a cell with ploidy φ, several considerations need to be made. First,
to simplify the equation, the probability of mutating more than one copy of a gene in
a single generation is considered very small ($\mu \gg \mu^n$, $n \geq 2$). Then, a general ODE for
dominant genes can be drawn by taking µ as the probability of mutating one
of the copies of a gene in one generation, $N_{OG}$ as the total number of oncogenes and $N_\phi$
as the number of activated oncogenes in a cell with ploidy φ (Equation 1).

$$\frac{dN_\phi}{dt} = \phi\mu\,(N_{OG} - N_\phi) \tag{1}$$
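
As a minimal numerical sketch of Equation 1 (not the code used in this thesis), the ODE can
be integrated with a forward Euler step and compared with the closed-form solution of this
linear ODE for $N_\phi(0) = 0$. The function names and the values of `MU`, `N_OG` and `T`
are illustrative assumptions and are not the Table S1 parameters.

```python
import math


def activated_oncogenes_euler(phi, mu, n_og, generations, dt=1.0):
    """Forward-Euler integration of dN/dt = phi*mu*(n_og - N), starting from N(0) = 0."""
    n = 0.0
    for _ in range(int(round(generations / dt))):
        n += dt * phi * mu * (n_og - n)
    return n


def activated_oncogenes_exact(phi, mu, n_og, t):
    """Closed-form solution of the same linear ODE with N(0) = 0."""
    return n_og * (1.0 - math.exp(-phi * mu * t))


# Illustrative (assumed) values, not taken from Table S1.
MU, N_OG, T = 1e-4, 100, 5000

for phi in (1, 2, 3, 4):
    approx = activated_oncogenes_euler(phi, MU, N_OG, T)
    exact = activated_oncogenes_exact(phi, MU, N_OG, T)
    print(f"ploidy {phi}: Euler {approx:.3f}, closed form {exact:.3f}")
```

With one Euler step per generation the two values should agree closely as long as φµ is
small, which is the regime assumed by the model.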

2.2.2 Recessive genes
The main objective of this model is to describe, also at the single-cell level, how genome
ploidy affects the accumulation of mutations in recessive genes (TSG and HKG) [9, 17].

Focusing on the housekeeping gene family, we aim at understanding how all-mutated HKG
accumulate, thus reducing cellular viability [26, 27]. A symmetric approach will suffice
to understand TSG mutation accumulation, as both gene families have mainly
loss-of-function recessive mutations in cancer [17, 27]. Taking into account that the probability
of mutating a copy of a gene per generation (µ) needs to be multiplied by the number of
copies of the gene that are left unmutated, and considering that in one generation more
than one copy of a single gene can be mutated ($\mu^n$ with $n > 1$), a reaction-like scheme
for the probabilities of mutating one or more gene copies per generation can be drawn
(Figure 6).

In order to simplify the model, and to be able to retrieve a general expression for the
accumulation of genes with all copies mutated (which would induce the loss of the function
of that gene) for a given ploidy φ, the model assumes that only one copy of a gene can be
mutated per generation ($\mu^n \approx 0$, $n \geq 2$, due to $\mu \gg \mu^n$, $n \geq 2$). Taking this into account,
from figure 6, four systems of ordinary differential equations can be computed (Equations
4-7, equations S.1-S.6).

Each system has a corresponding ODE for the dynamics of the number of copies that can
be mutated for a gene ($N_{iM}$ = number of genes with i copies mutated, for i = 1, 2, 3, 4).
This means that the law of mass action, which states that the rate of a reaction ($dN_{1M}/dt$)
is directly proportional to the concentrations of the reactants ($N_{0M}$) and the reaction
probability (Xµ), can be used [40].

Figure 6: Reaction-like schemes of the recessive mutation model for different ploidies.
(a) Haploids (b) Triploids (c) Diploids (d) Tetraploids

In a cell with ploidy φ, the system will be formed by φ different ODEs. Using the
reaction-like scheme above (Figure 6), and defining $N_{iM}$ as the number of genes with i
copies mutated, the system of ODEs for a cell with ploidy φ can be defined (Equations
2, 3).

$$\frac{dN_{iM}}{dt} = -\phi\mu N_{iM}, \quad i = 0 \tag{2}$$

$$\frac{dN_{iM}}{dt} = (\phi - i + 1)\,\mu N_{(i-1)M} - (\phi - i)\,\mu N_{iM}, \quad i \in [1, \phi) \tag{3}$$
                  dt

Using equations 2 and 3, for example, the accumulation of mutations in recessive genes in
cells with ploidy four would be characterized by the system of ODEs formed by equations
4 to 7.

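$$\frac{dN_{0M}}{dt} = -4\mu N_{0M} \tag{4}$$

$$\frac{dN_{1M}}{dt} = 4\mu N_{0M} - 3\mu N_{1M} \tag{5}$$

$$\frac{dN_{2M}}{dt} = 3\mu N_{1M} - 2\mu N_{2M} \tag{6}$$

$$\frac{dN_{3M}}{dt} = 2\mu N_{2M} - \mu N_{3M} \tag{7}$$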

By solving a system of ODEs (Equations 4-7), an equation that defines the accumulation
of recessive genes with all copies mutated can be retrieved (see Results, equations 19-22).
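
The system above can also be checked numerically. The following is a minimal sketch, not the
thesis implementation: it integrates Equations 2 and 3 with forward Euler for an arbitrary
ploidy and compares the number of genes with all copies mutated against the expectation for
independent copies, each mutated with probability $1 - e^{-\mu t}$. Function names and
parameter values are assumptions chosen for illustration.

```python
import math


def recessive_system_euler(phi, mu, n_genes, generations, dt=1.0):
    """Integrate Equations 2-3 with forward Euler.

    state[i] holds the number of genes with i mutated copies, i = 0..phi-1.
    Genes with all phi copies mutated are the remainder: n_genes - sum(state).
    """
    state = [float(n_genes)] + [0.0] * (phi - 1)
    for _ in range(int(round(generations / dt))):
        new_state = state[:]
        for i in range(phi):
            inflow = (phi - i + 1) * mu * state[i - 1] if i > 0 else 0.0
            outflow = (phi - i) * mu * state[i]
            new_state[i] = state[i] + dt * (inflow - outflow)
        state = new_state
    return n_genes - sum(state)


def all_copies_mutated_closed_form(phi, mu, n_genes, t):
    """Independent-copies expectation: each copy is mutated with probability 1 - exp(-mu*t)."""
    return n_genes * (1.0 - math.exp(-mu * t)) ** phi


# Illustrative (assumed) values, not taken from Table S1.
MU, N_GENES, T = 1e-4, 1000, 10000

for phi in (1, 2, 3, 4):
    numeric = recessive_system_euler(phi, MU, N_GENES, T)
    closed = all_copies_mutated_closed_form(phi, MU, N_GENES, T)
    print(f"ploidy {phi}: Euler {numeric:.2f}, closed form {closed:.2f}")
```

Tracking only the first φ states and recovering the fully mutated class as a remainder mirrors
the structure of Equations 4-7, which contain no explicit equation for the last state.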

2.3      Reliability theory models

One of the difficulties in modeling cancer evolution is that the shape of the landscape in
cancer is flat and corrugated, with fitness changes due to mutations in cancer genes being
very variable and small [8, 11]. Reliability theory (RT), first developed to design systems
that minimize breakdowns in operation (electronic circuits, aircraft design), allows us
to look at cancer evolution from a novel perspective [41]. In RT, the cell's genome is
considered as a system with components (the genes) that can fail (mutate) [7, 37]. With
this alternative approach, the RT framework can be used to determine the
probability for a cell to become cancerous or the implied oncogenic risk of a specific
genome configuration, among other questions. In addition, RT allows for the generation
of models with reduced parameter complexity due to its focus on the effect of a system's
structure on its viability. This allows for the generation of mathematical frameworks in
cancer that do not depend on an underlying evolutionary landscape that has been difficult
to define [38].

2.3.1 Probabilistic landscape
Using reliability theory, qualitative landscapes for cancer and HIV have already been
developed [7, 37]. Although realistic landscapes for cancer have not yet been defined, a
conceptual and probabilistic landscape can still give insight into the development and role
of WGD. Thus, we here propose an RT approach to the previously defined genome.

First, mutations in oncogenes (OG) are considered to produce a linear increase in the
replication rate of cells [7]. This is a suitable assumption if only a few genes are mutated,
as the diminishing-returns epistasis effect can be discarded (the fitness increase gets smaller
as the fitness of the cell increases) [42]. In this scenario, the model assumes a linear
accumulation of mutated oncogenes without saturation (linearisation of equation 18).
Under these assumptions, the expression r(µ) can be defined, where
r is the replication rate, $r_0$ is the initial replication rate, $\delta_{OG}$ is the replication rate gain
per oncogene, $N_{OG}$ is the total number of oncogenes, φ is the ploidy (number of copies of
a single gene) and µ is the mutation rate per gene copy per generation (Equation 8, see
Solé et al. for a more detailed description [7]).

$$r = r_0 + \delta_{OG} N_{OG}\,\phi\mu \tag{8}$$

Tumor suppressor genes (TSG) are another important type of driver in tumor evolution
that has not yet been considered in previous RT approaches to cancer. From this group,
only the gatekeepers directly affect the replication rate, as they are responsible for
regulating the cell cycle [14, 22]. Knock-outs in gatekeeper genes allow cells to bypass restriction
point controls. In addition, mutation of TSG is key in cancer development, as under
oncogenic stress, some TSG activate pathways to inhibit cell growth or produce senescence [23].

So, in order for cells with mutations in oncogenes (Equation 8) to survive, at least one
relevant TSG needs to be mutated. An example of this behavior is TP53
mutation, which is thought to be a significant event that is commonly found in
a wide variety of cancers [2]. To introduce this into our landscape, the probability of
mutating one or more TSG, with $N_{TSG}$ as the total number of TSG, is defined (for the
complete development of equation 9 see equations S.7-S.11).

$$P_{TSG} = 1 - \left(1 - \mu^{\phi}\right)^{N_{TSG}} \tag{9}$$

Finally, there is a need to include the housekeeping genes (HKG) in our final landscape.
To do that, the probability of not mutating any housekeeping gene, with $N_{HK}$ as the total
number of housekeeping genes, is defined (for the complete development of equation 10,
see equations S.7-S.9). Even though most mutations in housekeeping genes are known
to have mild effects on cellular fitness, as shown by their accumulation as passengers of
evolution, we here assume that no HKG mutations are desired in the previously mentioned
cancerous genome configuration (Figure 5) [26, 27].

$$P_{HK} = \left(1 - \mu^{\phi}\right)^{N_{HK}} \tag{10}$$

By combining the three expressions (Equations 8-10), we can compute a probabilistic
expression for cellular replication (Equation 11), namely the fitness increase of OG
mutations, provided that at least one TSG has been mutated, and no HKG have been lost.

$$r = \left(r_0 + \delta_{OG} N_{OG}\,\phi\mu\right)\left[1 - \left(1 - \mu^{\phi}\right)^{N_{TSG}}\right]\left(1 - \mu^{\phi}\right)^{N_{HK}} \tag{11}$$
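
A minimal sketch of how Equation 11 could be evaluated is given below. It is not the code
used for the figures; the default parameter values (`r0`, `delta_og`, `n_og`, `n_tsg`,
`n_hk`) are placeholders assumed for illustration and are not the Table S1 values.

```python
def replication_landscape(mu, phi, r0=1.0, delta_og=0.1, n_og=100, n_tsg=100, n_hk=1000):
    """Equation 11: oncogene gain * P(at least one TSG lost) * P(no HKG lost)."""
    p_gene_lost = mu ** phi                       # all phi copies of one recessive gene mutated
    oncogene_gain = r0 + delta_og * n_og * phi * mu   # Equation 8
    p_tsg = 1.0 - (1.0 - p_gene_lost) ** n_tsg    # Equation 9
    p_hk = (1.0 - p_gene_lost) ** n_hk            # Equation 10
    return oncogene_gain * p_tsg * p_hk


# Scan a few mutation rates for a diploid cell (assumed parameter values).
for mu in (1e-4, 1e-3, 1e-2, 3e-2, 1e-1, 5e-1):
    print(f"mu = {mu:.0e}: r = {replication_landscape(mu, phi=2):.3e}")
```

The product form makes the trade-off explicit: very low µ leaves every TSG intact, while very
high µ destroys at least one HKG, so the value vanishes at both extremes.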

2.3.2 Evolving landscape
In the previous section, we have introduced a general probabilistic landscape for cancer
evolution that follows previous work on optimal instability levels in cancer and viral
evolution [7, 37]. However, the landscape describes the probability of developing a cancer-prone
genome in a single generation, since µ is the gene mutation rate per cell division. In this
context, a further iteration of the landscape that considers the effect of mutation
accumulation in time is presented. In addition, we here focus on the two relevant compartments
modulating mutation rate, namely TSG and HKG, as the oncogene effect (Equation 8)
simply induces a linear increase in replication with negligible effect on optimal genome
reliability.

To include time in the landscape, the probability of not mutating a single copy of a gene
in a certain time span needs to be defined, henceforth called the reliability of a gene
copy. From Bazovsky's work on reliability theory, the reliability of a single component
(gene copy) $R_c(t)$ with a constant chance failure rate (mutation rate, µ) can be drawn
(Equation 12) [41].

$$R_c(t) = e^{-\mu t} \tag{12}$$

From the definition of unreliability (probability of mutating in a specific time span), the
unreliability of a gene copy is defined as $Q_c = 1 - R_c$. If the mutated alleles are recessive (TSG
and HKG), all copies of a gene need to be mutated to generate a change in the fitness [9,
17, 27]. Therefore, a single gene is a parallel system where the probability of mutating all
the copies of a gene in a certain time period (unreliability of the gene, here $Q_g(t)$) is the
product of the unreliabilities of its individual copies (Figure 5, equation 13).

$$Q_g(t) = \left(Q_c(t)\right)^{\phi} = \left(1 - e^{-\mu t}\right)^{\phi} \tag{13}$$

Then, considering that the reliability of a gene is defined as $1 - Q_g$, the probability of
not mutating all the copies of any housekeeping gene in a specific time (reliability of
the housekeeping gene compartment, here $R_{HK}$) is the product of the reliabilities of the
individual housekeeping genes (Equation 14). This means that the genes by themselves form a
system in series where the mutation of one gene leads to the decay of the system (Figure
5). Thus, equation 14 represents a case where all the HKG have the same effect on the
landscape.

$$R_{HK} = \left[1 - \left(1 - e^{-\mu t}\right)^{\phi}\right]^{N_{HK}} \tag{14}$$

Finally, tumor suppressor genes are taken into account by defining the unreliability of
all the TSG as $1 - R_{TSG}$ (Equation 15). Here the unreliability can be defined as the
probability of mutating one or more TSG in all of their copies. As previously
stated, mutation of TSG (for example, TP53) is a critical process in carcinogenesis, as
mutated TSG are not able to properly activate senescence and/or growth-inhibition
pathways under oncogenic stress [23].

$$Q_{TSG} = 1 - \left[1 - \left(1 - e^{-\mu t}\right)^{\phi}\right]^{N_{TSG}} \tag{15}$$

By combining equations 14 and 15, the final expression for the evolving landscape is
formed (Equation 16). This simplified expression allows the optimal mutation
rate and ploidy to be retrieved in a simpler manner. For complete information on the derivation of the
optimal mutation rate and ploidy, see equations S.28-S.43.

$$P = \left(1 - \left[1 - \left(1 - e^{-\mu t}\right)^{\phi}\right]^{N_{TSG}}\right)\left[1 - \left(1 - e^{-\mu t}\right)^{\phi}\right]^{N_{HK}} \tag{16}$$
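
The chain from Equation 12 to Equation 16 can be sketched in a few lines. This is an
illustrative sketch under assumed gene counts (`n_tsg`, `n_hk`), not the thesis code and not
the Table S1 parameters.

```python
import math


def gene_unreliability(mu, t, phi):
    """Equation 13: probability that all phi copies of one gene are mutated by time t."""
    return (1.0 - math.exp(-mu * t)) ** phi


def evolving_landscape(mu, t, phi, n_tsg=100, n_hk=1000):
    """Equation 16: P(at least one TSG fully lost) * P(no HKG fully lost) at time t."""
    q_gene = gene_unreliability(mu, t, phi)
    r_hk = (1.0 - q_gene) ** n_hk             # Equation 14, HKG in series
    q_tsg = 1.0 - (1.0 - q_gene) ** n_tsg     # Equation 15
    return q_tsg * r_hk


# Compare ploidies for a fixed (assumed) mutation rate over time.
for t in (100, 1000, 10000):
    values = ", ".join(f"phi={phi}: {evolving_landscape(1e-5, t, phi):.2e}" for phi in (1, 2, 3, 4))
    print(f"t = {t}: {values}")
```

For a single generation and small µ, `gene_unreliability` is close to $\mu^{\phi}$, which
recovers the single-division form used in Equations 9 and 10.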

It can be seen how, in the evolving probability landscape (Equation 16), the expression
$q(t, \mu) = 1 - e^{-\mu t}$ substitutes the mutation rate (µ) in the final probabilistic landscape
equation (11). This is because the probability of mutating all copies of a gene in a certain
time span t is precisely the fraction of mutated recessive genes at time t, defined in the
mutation accumulation models (Equation 23). To take this relationship into account,
and to be able to model the evolution of the replication rate, a modified version of the
oncogene expression can be added to the evolving landscape (Equation 17).

$$r = \left[r_0 + \delta_{OG} N_{OG}\left(1 - e^{-\phi\mu t}\right)\right]\left(1 - \left[1 - \left(1 - e^{-\mu t}\right)^{\phi}\right]^{N_{TSG}}\right)\left[1 - \left(1 - e^{-\mu t}\right)^{\phi}\right]^{N_{HK}} \tag{17}$$

3     Results
3.1    Accumulation of mutations across gene families

On the one hand, in dominant genes, by solving a single general ODE (Equation 1), an
expression that defines the number of activated oncogenes found in a single cell after t
generations can be drawn (Equation 18).

$$N_\phi = N_{OG}\left(1 - e^{-\phi\mu t}\right) \tag{18}$$

This equation represents the effect of the ploidy on the accumulation of activated
oncogenes. As seen above, this effect is linear, thus allowing cells with higher ploidies to
activate multiple oncogenes more rapidly.

On the other hand, in recessive genes (TSG and HKG), multiple systems of ODEs
(Equations 2, 3) need to be solved. Considering $N_{HK}$ as the total number of housekeeping genes,
four expressions that represent the number of genes with all copies mutated in a cell with
ploidies from 1 to 4 can be defined (Equations 19-22).

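$$N_{1M} = N_{HK}\left(1 - e^{-\mu t}\right) \tag{19}$$

$$N_{2M} = N_{HK}\left(1 - 2e^{-\mu t} + e^{-2\mu t}\right) \tag{20}$$

$$N_{3M} = N_{HK}\left(1 - 3e^{-\mu t} + 3e^{-2\mu t} - e^{-3\mu t}\right) \tag{21}$$

$$N_{4M} = N_{HK}\left(1 - 4e^{-\mu t} + 6e^{-2\mu t} - 4e^{-3\mu t} + e^{-4\mu t}\right) \tag{22}$$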

From equations 19 to 22, a general equation for the accumulation of genes with all copies
mutated in cells with a given ploidy φ can be retrieved (Equation 23). This expression
represents a dynamical hint on the strong, non-linear effect of the ploidy on the inactivation
of recessive genes (TSG and HKG).

$$N_{\phi M} = N_{HK}\left(1 - e^{-\mu t}\right)^{\phi} \tag{23}$$
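
The step from Equations 19-22 to Equation 23 is the binomial theorem. As a short worked
check, writing $x = e^{-\mu t}$ (a shorthand introduced here only for this derivation):

```latex
\begin{align*}
\left(1 - x\right)^{\phi}
  &= \sum_{k=0}^{\phi} \binom{\phi}{k} (-1)^{k} x^{k}, \\
\phi = 4:\quad \left(1 - x\right)^{4}
  &= 1 - 4x + 6x^{2} - 4x^{3} + x^{4}
   = 1 - 4e^{-\mu t} + 6e^{-2\mu t} - 4e^{-3\mu t} + e^{-4\mu t}, \\
\mu t \ll 1:\quad N_{\phi M}
  &\approx N_{HK}\,(\mu t)^{\phi}.
\end{align*}
```

The second line reproduces the bracket of Equation 22, and the last line shows why the early
accumulation of fully mutated genes is delayed more strongly as the ploidy increases.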

To assess the effect of the ploidy in cancer genes (recessive and dominant), simulations
of the models of the accumulation of mutations in dominant and recessive genes were
performed (Equations 18, 23, figure 7). As mutations in both TSG and HKG are
recessive, only the accumulation of housekeeping genes with all copies mutated was simulated
(Figure 7b) [9, 17, 27]. The values for the parameters used in both simulations can be
found in the supporting information section (Table S1).

Figure 7: (a) Evolution in the accumulation of oncogenes with one or more copies
mutated in a single cell with different ploidies. (b) Evolution in the accumulation of
housekeeping genes with all copies mutated in a single cell with different ploidies.

As expected, the number of copies of the genes (ploidy) affects genes whose mutations
in cancer are dominant very differently from those whose mutations are recessive (Figure 7).
In dominant genes, such as oncogenes, a higher ploidy increases the rate of gene activation,
as more gene copies generate a linear increase in the rate of genetic defect accumulation (Figure 7a).
In contrast, in recessive genes, such as TSG and HKG, a higher ploidy decreases the rate
at which these are inactivated, transforming the exponential curve seen in haploid
cells into an increasingly sigmoidal curve in diploid to tetraploid cells by introducing a
delay in the accumulation of genes with all copies mutated (Figure 7b).
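
One way to quantify the opposite effects described above is to invert Equations 18 and 23 for
the time needed to reach a given fraction f of affected genes. The sketch below does this; the
function names and the values of `MU` and `F` are illustrative assumptions, not thesis
parameters.

```python
import math


def time_to_activate_fraction(f, phi, mu):
    """Invert Equation 18: generations until a fraction f of oncogenes is activated."""
    return -math.log(1.0 - f) / (phi * mu)


def time_to_lose_fraction(f, phi, mu):
    """Invert Equation 23: generations until a fraction f of recessive genes is fully mutated."""
    return -math.log(1.0 - f ** (1.0 / phi)) / mu


# Illustrative (assumed) mutation rate and target fraction.
MU, F = 1e-4, 0.5

for phi in (1, 2, 3, 4):
    t_on = time_to_activate_fraction(F, phi, MU)
    t_off = time_to_lose_fraction(F, phi, MU)
    print(f"ploidy {phi}: half of OG activated after {t_on:.0f} generations, "
          f"half of HKG lost after {t_off:.0f} generations")
```

For haploid cells both times coincide, while for higher ploidies the activation time shrinks
as 1/φ and the inactivation time grows.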

This is an interesting result, as it gives a preliminary account of how ploidy (and thus
chromosomal instability) alters the mutational dynamics of relevant carcinogenic gene
families. On the one hand, the result implies that higher ploidies could be beneficial,
as they linearly increase the rate of oncogene activation. In addition, a higher ploidy
decreases the rate of housekeeping gene loss nonlinearly, maintaining cellular function
and thus preventing Muller's ratchet from affecting the tumor's evolution [27, 35].
On the other hand, the same mechanism ensures that TSG are kept working properly,
ensuring a genome with lower oncogenic potential. Consistent evidence points to TSG
inactivation by mutation being a rare event in WGD+ tumors [43]. This result could be
indicative of a possible optimal ploidy value in cancer evolution, able to protect HKG
while keeping TSG mutable.

3.2      Ploidy and genetic instability in cancer evolution

Using reliability theory, a probabilistic landscape that allows us to assess the intertwined
effect of ploidy (φ) and the mutation rate (µ) on a simplified genome was developed
(Equation 11). From this model, an optimal mutation rate for a given ploidy φ can be
retrieved (Equation 24). In addition, an expression for the optimal ploidy is presented
(Equation 25). For complete information on the derivations, see equations S.12-S.27.

$$\mu = \left(\frac{1}{N_{HK} + N_{TSG}}\right)^{1/\phi} \tag{24}$$

$$\phi = \frac{\ln\left(\frac{1}{N_{HK} + N_{TSG}}\right)}{\ln(\mu)} \tag{25}$$

Equations 24 and 25 represent the existence of an optimal instability level µ for a cell with
ploidy φ and vice versa. This optimal mutation rate balances the evolutionary pressure
of not having any housekeeping gene with all copies mutated with the necessity of having
one or more tumor suppressor genes inactivated.
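
A small sketch can make Equations 24 and 25 concrete. It evaluates both expressions, checks
that they are mutually consistent, and compares the TSG and HKG factors of Equation 11 at the
stated optimum against half and double that mutation rate. This is only a sanity check, not a
proof of optimality, and the gene counts `N_HK` and `N_TSG` are assumed values rather than the
Table S1 parameters.

```python
import math


def optimal_mutation_rate(phi, n_hk, n_tsg):
    """Equation 24."""
    return (1.0 / (n_hk + n_tsg)) ** (1.0 / phi)


def optimal_ploidy(mu, n_hk, n_tsg):
    """Equation 25."""
    return math.log(1.0 / (n_hk + n_tsg)) / math.log(mu)


def cancer_genome_probability(mu, phi, n_hk, n_tsg):
    """TSG and HKG factors of Equation 11 (Equation 9 times Equation 10)."""
    x = mu ** phi
    return (1.0 - (1.0 - x) ** n_tsg) * (1.0 - x) ** n_hk


# Illustrative (assumed) gene counts.
N_HK, N_TSG = 1000, 100

for phi in (1, 2, 3, 4):
    mu_opt = optimal_mutation_rate(phi, N_HK, N_TSG)
    at_opt = cancer_genome_probability(mu_opt, phi, N_HK, N_TSG)
    below = cancer_genome_probability(mu_opt / 2, phi, N_HK, N_TSG)
    above = cancer_genome_probability(mu_opt * 2, phi, N_HK, N_TSG)
    print(f"ploidy {phi}: mu_opt = {mu_opt:.3e}, P(mu_opt) = {at_opt:.3e}, "
          f"P(mu_opt/2) = {below:.3e}, P(2 mu_opt) = {above:.3e}")
```

Substituting the output of `optimal_mutation_rate` into `optimal_ploidy` returns the original
ploidy, which is the "vice versa" relation between the two equations.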

In order to get a global picture of the oncogenic probability landscape, equation 11 was
plotted over the mutation rate (µ) and the ploidy (φ). This represents a conceptual
tool to understand the intertwined role of ploidy and microsatellite instability in cancer
progression. To observe the role of large karyotype configurations in cancer cells, the
ploidy range was extended to look for a possible decay of the replication rate at high ploidies.
Values for the parameters used to plot the landscape can be found in the supporting
information section (Table S1).

Figure 8: Visual representation of the oncogenic probability landscape associated with
the evolutionary dynamics of cancer cells (Equation 11). (a) General view of the
landscape, plotted against the mutation rate (µ) and the ploidy (φ). (b) Focus on the ploidies
seen typically in tumors.

To display clearly the overestimation of the optimal genome instability and the
underestimation of the ploidy value seen in the general probabilistic
landscape, two slices of the 3D plot seen in figure 8 are shown. In the first scenario,
the landscape is modeled for diploid cells (Figure 9a). In the second scenario,
the landscape is restricted to cells with maximum instability for the MMR phenotype
($\mu = 10^{-4}$) (Figure 9b) [44, 45].

Figure 9: (a) View of the probabilistic landscape for diploid cells plotted against the
mutation rate (µ). (b) View of the probabilistic landscape for cancer cells with maximum
instability for the MMR phenotype ($\mu = 10^{-4}$) plotted against the ploidy (φ). Equations
shown in both plots represent the optimal mutation rate/ploidy where the cell has the
highest replication rate.

The results for the probability of developing a cancerous genome, where TSG are mutated
but HKG are kept in place, indicate that for a given ploidy φ, there is an optimal
instability level that balances the capacity of mutating the OG and TSG compartments while
maintaining HKG in place (Figure 8). However, when compared to real data, optimal
instability values seem significantly higher than those of cancer cells with the mutator
phenotype [46, 47]. As an example, diploid MMR-deficient tumors typically have a
mutation rate around $\mu = 10^{-5}$. Instead, our model predicts an optimal mutation rate of
$\mu = 2.4 \cdot 10^{-2}$ (Equation 24, figure 9a) [46, 47].

At the same time, for a given instability level µ, the model seemingly underestimates the
ploidy that is needed to maintain genome integrity. The best example of this effect can be
found in diploid cells, whose instability limit is expected to correlate with experimental
values after MMR knockout [48]. However, in the present landscape, results indicate that
MMR-deficient phenotypes would be able to survive even with ploidy φ = 1 (Equation
25, figure 9b).

In addition, from the results seen in figure 8a, there is no clear disadvantage in having
a very high ploidy, as the peak replication rate constantly increases with higher ploidies.
This could indicate that the experimentally observed ploidy values in cancer cells might
not result from the negative effect of the ploidy on the mutation of TSG [5, 28].

A relevant consideration here is that our results so far contemplate a probability landscape
that arises from a single division event (µ is defined as the probability of mutating one
copy of a gene in a single division). However, as tumors progress and mutations in
HKG copies accumulate, it is likely that lower mutational levels are required to maintain
genome integrity.

3.3      Optimal instability and ploidy levels in evolving tumors

To understand the role of time, a reliability-like model able to account for how mutation
accumulation reshapes the optimal levels of instability and ploidy for carcinogenesis is
developed. From the simplified evolving landscape, an optimal mutation rate and an
optimal ploidy value can be retrieved (Equations 26, 27). For complete information on the
derivations, see equations S.28-S.43.

$$\mu = \frac{1}{t}\,\ln\left(\frac{1}{1 - \left(\frac{1}{N_{HK} + N_{TSG}}\right)^{1/\phi}}\right) \tag{26}$$

$$\phi = \frac{\ln\left(\frac{1}{N_{HK} + N_{TSG}}\right)}{\ln\left(1 - e^{-\mu t}\right)} \tag{27}$$

In addition, the mutation rate gain produced by an increased ploidy can be defined as the
ratio between the optimal mutation rate at ploidy φ and the optimal mutation rate at
ploidy φ − 1 (Equations 28, 29).

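$$k = \frac{\mu_{\phi}}{\mu_{\phi-1}} = \frac{\dfrac{1}{t}\,\ln\left(\dfrac{1}{1 - \left(\frac{1}{N_{HK} + N_{TSG}}\right)^{1/\phi}}\right)}{\dfrac{1}{t}\,\ln\left(\dfrac{1}{1 - \left(\frac{1}{N_{HK} + N_{TSG}}\right)^{1/(\phi-1)}}\right)} \tag{28}$$

$$k = \frac{\ln\left(\dfrac{1}{1 - \left(\frac{1}{N_{HK} + N_{TSG}}\right)^{1/\phi}}\right)}{\ln\left(\dfrac{1}{1 - \left(\frac{1}{N_{HK} + N_{TSG}}\right)^{1/(\phi-1)}}\right)} \tag{29}$$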

As seen in equation 29, the mutation rate gain produced by an increased ploidy remains
constant with time. This could imply a mathematical hint on the reason why WGD
appears to be an early phenomenon in cancer evolution [5, 27, 34].
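
The time independence of the gain can be checked directly. The sketch below evaluates
Equation 26 and the ratio of Equation 28 for several ploidies; the gene counts `N_HK` and
`N_TSG` are assumed values for illustration, not the Table S1 parameters.

```python
import math


def optimal_mutation_rate_at_time(t, phi, n_hk, n_tsg):
    """Equation 26: optimal mutation rate after t generations for ploidy phi."""
    x = (1.0 / (n_hk + n_tsg)) ** (1.0 / phi)
    return math.log(1.0 / (1.0 - x)) / t


def mutation_rate_gain(phi, n_hk, n_tsg, t=1.0):
    """Equation 28: ratio mu_phi / mu_(phi-1), for phi >= 2. The factor 1/t cancels (Equation 29)."""
    return (optimal_mutation_rate_at_time(t, phi, n_hk, n_tsg)
            / optimal_mutation_rate_at_time(t, phi - 1, n_hk, n_tsg))


# Illustrative (assumed) gene counts.
N_HK, N_TSG = 1000, 100

for phi in (2, 3, 4):
    early = mutation_rate_gain(phi, N_HK, N_TSG, t=10.0)
    late = mutation_rate_gain(phi, N_HK, N_TSG, t=10000.0)
    print(f"ploidy {phi - 1} -> {phi}: gain at t=10 is {early:.3f}, at t=10000 is {late:.3f}")
```

Under these assumed counts the gain is the same at both time points and becomes smaller with
each additional genome copy.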

Using equation 26, the optimal mutation rate for haploids, diploids, triploids and tetraploids
was plotted over 10000 generations, which in cancer could be considered more than
sufficient to kill the host (Figure 10a). Estimates of the number of generations needed for
a tumor to grow from a single cell are found around t = 1000 [49, 50, 51]. As expected,
as cells progress and deleterious mutations in HKG accumulate, the admissible mutation
rate decreases to avoid excessive genome damage. In parallel, we compute, for a rogue
cell that has lost MMR function ($\mu = 10^{-5}$), the optimal ploidy for cancer progression
(Figure 10b) [46, 47].

Finally, figure 10c shows how the optimal mutation rate gain defined by equation 29 decays
with ploidy. In addition, taking into account the effect of the oncogenes on the evolving
landscape (Equation 17), simulations of cells with the usual instability level of the
MMR-deficient phenotype (µ = 10^{-5}) were performed (Figure 10d) [46, 47].

Figure 10: (a) Optimal mutation rate across 10000 generations of haploid, diploid,
triploid and tetraploid cells. (b) Optimal ploidy across 10000 generations of a cell with
a mutator phenotype (µ = 10^{-5}). (c) Optimal mutation rate gain of a cell with ploidy φ
relative to a cell of ploidy φ − 1. (d) Evolution over time of the replication rate of a
diploid and a triploid cell with an MMR-deficient phenotype (µ = 10^{-5}).

As seen in figure 10a, an increased ploidy allows for a higher mutation rate, which can
translate into faster and more reliable evolution, as mutations in oncogenes can
accumulate more rapidly while housekeeping genes remain protected. Interestingly,
the beneficial effect of additional gene copies on the tolerated mutation rate diminishes
at higher ploidies, possibly preventing high ploidies from becoming fixed in the population
(Figure 10c). However, equation 26 shows that the optimal mutation rate of a cell at a
given number of generations t increases indefinitely with higher ploidies.

Another relevant aspect of the results presented here can be understood by considering
a cell that has lost its MMR machinery. If the mutation rate is that of an MMR-deficient
phenotype, the haploid genome quickly becomes non-optimal, as single-copy housekeeping
genes rapidly accumulate mutations (Figure 10b). In contrast, as mutations accumulate and
a higher ploidy is needed for HKG protection, the triploid genome becomes optimal only at
around 10000 generations, which is much more than sufficient for cancer to kill the host
[49, 50, 51]. Interestingly, the fact that diploid cells can survive (and even evolve)
under an MMR-deficient phenotype for long timespans without losing viability suggests
a possible explanation for the pervasiveness of MSI-positive/WGD-negative cells across
cancer types [5].

If the evolving landscape with oncogenes (Equation 17) is simulated over time, it generates
populations of cells with roughly three phases (Figure 10d). In the beginning, healthy
cells keep an initial low replication rate. Once a TSG is mutated, replication
rates tend to increase, as mutations in oncogenes are no longer restricted by the apoptotic
or senescence-inducing pathways that are typically controlled by gatekeeper TSG [23]. This
is an interesting result from our model, as it indicates that tumoral cells can survive initial
mutagenesis without necessarily becoming unviable.

This is untrue, however, for longer timespans. When many generations have passed, the
effect of Muller's ratchet appears, due to an accumulation of housekeeping genes with
several copies mutated. This increases the risk of having a housekeeping gene with all
functional copies mutated, which inevitably results in a decrease of the replication
rate and thus a decline in population fitness. Interestingly, as shown in figure
10d, ploidy has a powerful effect on the number of generations needed to change from an
evolving to a deleterious-driven cancer genome, consistent with recent evidence of WGD
as a mechanism to avoid Muller's ratchet [27].
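
The protective effect of ploidy against Muller's ratchet can be illustrated with a back-of-the-envelope Python sketch. It is not the simulation behind figure 10d: it assumes that every gene copy mutates independently with probability µ per generation, ignores selection and oncogenes, and uses a placeholder number of housekeeping genes. Under those assumptions it returns the generation at which the probability that at least one housekeeping gene has lost all of its copies reaches 50%.

```python
import math

def generations_to_ratchet(mu, phi, n_hk, threshold=0.5):
    """Generation at which P(some HKG has all phi copies mutated) reaches `threshold`.

    Assumes independent copies, each mutated by generation t with
    probability 1 - (1 - mu)**t, and no selection.
    """
    # Per-gene loss probability x such that 1 - (1 - x)**n_hk == threshold.
    per_gene = 1.0 - (1.0 - threshold) ** (1.0 / n_hk)
    # Per-copy mutation probability such that per_copy**phi == per_gene.
    per_copy = per_gene ** (1.0 / phi)
    # Solve 1 - (1 - mu)**t == per_copy for t.
    return math.log(1.0 - per_copy) / math.log(1.0 - mu)

MU = 1e-5    # mutation rate quoted in the text for MMR-deficient cells
N_HK = 3000  # placeholder number of housekeeping genes (assumption)

times = {phi: generations_to_ratchet(MU, phi, N_HK) for phi in (1, 2, 3, 4)}
for phi, t in times.items():
    print(f"ploidy {phi}: about {t:.0f} generations")
```

Under these assumptions, each added genome copy pushes the expected onset of the ratchet to later generations.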

4     Discussion
The work presented here has tried to shed some light on the pervasiveness of whole genome
doubling in cancer by building a mathematical framework able to target previously
unresolved questions regarding WGD. Although recent decades have seen an increase of
interest in WGD and aneuploidy in cancer, there is still a need to fully understand WGD,
from the evolutionary advantages that it carries to the functionality of the genome
configurations associated with it. In parallel, recent findings seem to indicate that WGD+
cells may have unique genetic vulnerabilities that could be exploited therapeutically, thus
reinforcing the importance of further research on WGD [52].

In this work, several models that examine WGD from the perspective of fitness landscapes
and reliability theory have been developed. The models are intentionally simple, with the
aim of obtaining a general framework able to characterize universal patterns seen across
cancer types. In particular, a major question has focused on understanding the tension
arising from cancer cells relying on mutating TSG while maintaining HKG unmutated
and the role of ploidy in this trade-off.

4.1    Accumulation of mutations

Two critical results arise from the models centered on the accumulation of mutations in
cancer genes. On the one hand, the effect of ploidy (φ) on dominant cancer genes (OG) is
linear and weak (Equation 18), so oncogenes do not play a significant role in
the definition of the optimal instability level in the landscape models. On the other hand,
the effect of ploidy on recessive cancer genes (TSG and HKG) is non-linear
and very strong, as the ploidy appears as an exponent in equation 23. This means that
high ploidies will strongly hinder the inactivation of both TSG, which is needed for tumor
development, and HKG, which is considered to be deleterious. This result is consistent
with evidence suggesting that mutations in TSG after WGD typically do not affect all
copies of the gene, thus always leaving a wild-type allele that blocks TSG inactivation [43].
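
The contrast between the two regimes can be illustrated with a minimal Python sketch. It assumes that each gene copy is mutated independently with probability p, that a dominant gene (OG) is activated when at least one copy is hit, and that a recessive gene (TSG or HKG) is inactivated only when all φ copies are hit. These textbook forms are assumptions made for illustration and are not necessarily the exact expressions of equations 18 and 23; the value of p is arbitrary.

```python
def dominant_hit(p, phi):
    """Probability that at least one of phi copies is mutated (about phi * p for small p)."""
    return 1.0 - (1.0 - p) ** phi

def recessive_hit(p, phi):
    """Probability that all phi copies are mutated (p to the power phi)."""
    return p ** phi

p = 1e-3  # arbitrary per-copy mutation probability (assumption)
for phi in (1, 2, 3, 4):
    print(f"phi={phi}: dominant={dominant_hit(p, phi):.3e}, "
          f"recessive={recessive_hit(p, phi):.3e}")
```

Going from one to two copies roughly doubles the dominant probability, whereas it multiplies the recessive probability by p itself.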

All in all, the model presented here provides an analytic demonstration of the crucial
role of ploidy as a mechanistic agent modulating recessive mutations in cancer evolution.
Furthermore, the clear difference between the ploidy's effect on OG and TSG seen in
equations 18 and 23 could be considered in CIN models where the number of chromosome
copies is determined by how oncogenic or tumor-suppressive the chromosome is [28].

4.2    Reliability in 1 division

Inspired by previous reliability theory models that study the role of critical mutation rates
in HIV, a minimal mathematical framework that includes the three main cancer gene
families (OG, TSG and HKG) and the ploidy has been constructed (Equation 11) [7, 37]. Using
reliability theory, we have captured the intertwined effects of microsatellite instability and
ploidy on cancer evolution in a minimal model, thus obtaining a first analytical
description of the complexity of microsatellite and chromosomal instability pathways in cancer.
In its simplest form, our probabilistic landscape can be understood as the
unreliability of the TSG compartment (probability of mutating one or more TSG)
multiplied by the reliability of the HKG compartment (probability of not mutating any HKG).
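
The verbal definition above can be written as a minimal Python sketch. The per-gene inactivation probability used here, µ^φ (all φ copies mutated in one division), the gene counts, and the function name are illustrative assumptions and are not necessarily the exact form of Equation 11.

```python
def landscape(mu, phi, n_tsg, n_hk):
    """Unreliability of the TSG compartment times reliability of the HKG compartment."""
    p_gene = mu ** phi  # assumed: a gene is lost when all phi copies mutate
    tsg_unreliability = 1.0 - (1.0 - p_gene) ** n_tsg  # P(one or more TSG mutated)
    hkg_reliability = (1.0 - p_gene) ** n_hk           # P(no HKG mutated)
    return tsg_unreliability * hkg_reliability

N_TSG, N_HK = 300, 3000  # placeholder gene counts (assumption)

for mu in (1e-4, 1e-2, 0.1, 0.5):
    values = ", ".join(
        f"phi={phi}: {landscape(mu, phi, N_TSG, N_HK):.3e}" for phi in (1, 2, 3)
    )
    print(f"mu={mu:g} -> {values}")
```

In this sketch the product vanishes at both extremes: with too little mutation no TSG is hit, and with too much at least one HKG is lost, so the landscape peaks at an intermediate instability level.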
