The chrysanthemum lavandulifolium genome and the molecular mechanism underlying diverse capitulum types - Oxford Academic

 
Horticulture Research, 2022, 9: uhab022
https://doi.org/10.1093/hr/uhab022

Article

The chrysanthemum lavandulifolium genome and the
molecular mechanism underlying diverse capitulum
types

Xiaohui Wen1,2,†, Junzhuo Li1,†, Lili Wang3,†, Chenfei Lu1, Qiang Gao2, Peng Xu4,5, Ya Pu1, Qiuling Zhang1, Yan Hong1, Luo Hong1, He Huang1, Huaigen Xin3,
Xiaoyun Wu1, Dongru Kang6, Kang Gao1, Yajun Li1, Chaofeng Ma1, Xuming Li3, Hongkun Zheng3, Zicheng Wang6,*, Yuannian Jiao4,5,*, Liangsheng Zhang2,* and
Silan Dai1,*

1 Beijing Key Laboratory of Ornamental Plants Germplasm Innovation & Molecular Breeding, National Engineering Research Center for Floriculture, Beijing
Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of the Ministry of
Education, School of Landscape Architecture, Beijing Forestry University, No. 35 East Qinghua Road, Beijing 100083, China
2 Genomics and Genetic Engineering Laboratory of Ornamental Plants, Department of Horticulture, College of Agriculture and Biotechnology, Zhejiang University,
No. 866 Yuhangtang Road, Hangzhou 310058, China
3 Biomarker Technologies Co., Ltd, No. 12 Fuqian Street, Shunyi District, Beijing 101300, China
4 State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Beijing 100093, China
5 University of Chinese Academy of Sciences, No. 19(A) Yuquan Road, Beijing 100049, China
6 State Key Laboratory of Crop Stress Adaptation and Improvement, Plant Germplasm Resources and Genetic Laboratory, Kaifeng Key Laboratory of
Chrysanthemum Biology, School of Life Sciences, School of Agriculture, Henan University, Jinming Road, Kaifeng 475004, China
*Corresponding authors. E-mail: wzc@henu.edu.cn; jiaoyn@ibcas.ac.cn; zls83@zju.edu.cn; silandai@sina.com
† These authors contributed equally

Abstract
Cultivated chrysanthemum (Chrysanthemum × morifolium Ramat.) is a beloved ornamental crop due to the diverse capitula types
among varieties, but the molecular mechanism of capitulum development remains unclear. Here, we report a 2.60 Gb chromosome-
scale reference genome of C. lavandulifolium, a wild Chrysanthemum species found in China, Korea and Japan. The evolutionary analysis
of the genome revealed that only recent tandem duplications occurred in the C. lavandulifolium genome after the shared whole genome
triplication (WGT) in Asteraceae. Based on the transcriptomic profiling of six important developmental stages of the radiate capitulum
in C. lavandulifolium, we found genes in the MADS-box, TCP, NAC and LOB gene families that were involved in disc and ray floret
primordia differentiation. Notably, NAM and LOB30 homologs were specifically expressed in the radiate capitulum, suggesting their
pivotal roles in the genetic network of disc and ray floret primordia differentiation in chrysanthemum. The present study not only
provides a high-quality reference genome of chrysanthemum but also provides insight into the molecular mechanism underlying the
diverse capitulum types in chrysanthemum.

Introduction
Cultivated chrysanthemum (Chrysanthemum × morifolium Ramat.) is a well-known ornamental crop showing very diverse flower
morphologies. The flower of a chrysanthemum is actually a capitulum that comprises inner disc florets and peripheral ray florets.
The flower types of chrysanthemum are determined by the morphology and relative numbers of disc and ray florets on a capitulum
[1]. Understanding the molecular mechanism of disc and ray floret differentiation under the same genetic background will provide
not only a foundation for the clarification of complex capitulum morphology in chrysanthemum but also insight into the floral
development mechanism in Asteraceae. However, studies on the mechanism of flower type in chrysanthemum are hindered by the
complex background of chrysanthemum, making it difficult to directionally breed for flower types, which restricts the utilization
of rich flower type resources in chrysanthemum. To date, the genomes of 16 Asteraceae species have been sequenced, which provides
insights into the Asteraceae genome [2]. However, most of these species have a relatively distant relationship with chrysanthemum.
The genome sequencing quality of some Asteraceae species was relatively low due to the limitation of sequencing technology and
the high heterozygosity of Asteraceae (Supplementary Table 1). The genomes of C. seticuspe and C. nankingense, belonging to the
genus Chrysanthemum, have been sequenced by the Illumina sequencing platform and Oxford Nanopore long-read technology,
respectively, without reaching the chromosome level [3, 4]. Therefore, it is necessary to obtain a high-quality chromosome-scale
genome of the genus Chrysanthemum and study the origin of chrysanthemum at the whole-genome level.

Received: April 10, 2021; Accepted: September 17, 2021; Published: 20 January 2022
© The Author(s) 2022. Published by Oxford University Press on behalf of Nanjing Agricultural University. This is an Open Access article distributed under the terms
of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted reuse, distribution, and reproduction
in any medium, provided the original work is properly cited.

   Flower development is a complex process that involves a complex gene regulatory network [5, 6]. The expression patterns of
abundant genes could be detected during flower development using transcriptomic and genomic sequencing technology. Studies on
single flowers have revealed that ABCE-class and CYC2-LIKE genes are involved in floral organ identity and the regulation of
floral symmetry [6, 7]. The development of next- and third-generation sequencing technologies enables the generation of
high-quality genomes, and subsequently, these reference genome data can provide more information to study the molecular
mechanism of flower development [8]. Previous studies showed that the expression patterns and copy numbers of these genes could
regulate floral organ morphology to modify flower types [5, 7, 9]. Comparative genomic analysis can also help to reveal the copy
numbers and evolution of floral development-related genes [10–12]. To date, several transcriptome sequencing profiles have been
carried out in C. lavandulifolium, C. nankingense and C. × morifolium “Jinba” [3, 13]. However, the analysis of these
transcriptomes lacked either a reference genome or the transcriptomic profiles of disc and ray floret differentiation stages.
Moreover, studies in Gerbera hybrida, Cosmos bipinnata and Senecio vulgaris found additional floral-related genes involved in
capitulum development [14–16]. In conclusion, more transcriptomes and reference genomes of chrysanthemum are needed to explore
hub genes that participate in the molecular mechanism underlying the diverse capitula types.
   The diploid species C. lavandulifolium (2n = 2x = 18) is often regarded as one of the ancestral species of chrysanthemum
[17, 18]. It is also used as a model plant to study the diverse capitula types in chrysanthemum due to its simple capitulum type,
which possesses only one round peripheral ray floret and many round inner disc florets [19, 20]. Here, we sequenced and anchored
2.60 Gb sequences to 9 pseudochromosomes of C. lavandulifolium. Phylogenetic analysis showed that C. lavandulifolium was an
important donor species for chrysanthemum [17]. Transcriptomic profiles of different developmental stages in radiate
(C. lavandulifolium, Chrysanthemum indicum, Chrysanthemum vesticum, C. × morifolium “28”, Erigeron breviscapus, Helianthus
annuus), discoid (Hippolytia alashanensis and Helenium aromaticum) and ligulate (Lactuca sativa and Taraxacum kok-saghyz)
capitula were also analyzed to explore the hub genes involved in diverse capitulum development. Our study not only provides a
high-quality reference for the assembly of chrysanthemum genomes but also sheds light on the regulatory mechanism of capitulum
development in Asteraceae.

Results
Genome sequencing, assembly and annotation
The estimated genome size was approximately 2.64 Gb (with 68.57% repetitive sequences and 1.45% heterozygosity) for
C. lavandulifolium based on the results from the K-mer distribution (Supplementary Figure 1). In total, 269.39 Gb (102.04×
coverage) and 193.88 Gb (62.50× coverage) of data were generated using the PacBio RS II platform (Pacific Biosciences, Menlo
Park, CA, USA) and Oxford Nanopore sequencing technologies (Oxford Nanopore Technologies Limited, Oxford Science Park, Oxford,
UK), respectively. These sequences were independently assembled by WTDBG v2.5 (https://github.com/ruanjue/wtdbg), and the two
draft genomes were merged and redundant reads were removed by Quickmerge v0.3.0 [21] and purge haplotigs v1.0.4 [22], which
resulted in a 3.10 Gb genome (Supplementary Table 2). With the aid of high-resolution chromosome conformation capture (Hi-C)
technology, 2.93 Gb (94.46%) of contigs were anchored onto the 9 chromosomes, forming pseudomolecules (Figure 1 and
Supplementary Table 3). After the removal of redundant and heterozygous sequences by the heatmap signal of Hi-C, a 2.60 Gb
chromosomal-level genome for C. lavandulifolium was obtained, with a contig N50 of up to 497 kb (Figure 1a, Table 1 and
Supplementary Figure 2). Compared with the estimated C. lavandulifolium genome size by K-mer distribution analysis, 98.48% of
the C. lavandulifolium sequences were successfully assembled and anchored onto the nine chromosomes (Supplementary Table 3).

Table 1. Assembly summary for the C. lavandulifolium genome at the chromosomal level
Assembly statistics               Size/Number
Contig number                     10 136
Contig length (bp)                2 669 472 274
Contig N50 (bp)                   496 998
Contig N90 (bp)                   136 070
Average length (bp)               15 002 629
Maximum length (bp)               4 500 000
Minimum length (bp)               14 015
GC%                               36.02
Repeat                            66.15%
Number of complete genes          64 257
Scaffold number                   178
Scaffold N50 (bp)                 300 401 778
Scaffold N90 (bp)                 212 345 931
Largest scaffold (bp)             322 278 347
Total scaffold size (bp)          2 670 468 074

   This C. lavandulifolium genome is the first chromosome-level genome in the genus Chrysanthemum. The protein-coding genes were
annotated by ab initio prediction, homology and transcriptome-based approaches, and the results were then integrated by Evidence
Modeler. A total of 64 257 protein-coding genes were predicted (Supplementary Table 4), and the number of genes supported by
homology prediction and transcriptome prediction was 53 714, accounting for 83.59% (Supplementary Figure 3). Furthermore, 54 203
(84.35%) genes in the present reference genome were annotated by functional databases (Supplementary Table 5). We identified 1417
noncoding RNAs classified into 44 families: 16 families
Figure 1. Overview and genome evolution of the C. lavandulifolium genome. a, Genomic features of the C. lavandulifolium genome and the expression
profile data during capitulum development. I, density of repetitive sequences; II, GC content; III, gene density; IV, heterozygosity; V, expression of
differentially expressed genes in stage 5 versus stage 6 of capitulum development (stage 5, the formation stage of disc floret primordia, in which the
floret primordia began to initiate; stage 6, the formation stage of ray floret primordia, in which the disc and ray floret primordia began to differentiate
on the capitulum); and VI, collinearity block between different pseudochromosomes. b, Phylogenetic gene tree of C. lavandulifolium with 10 other
plant species (Amborella trichopoda, Arabidopsis thaliana, Coffea arabica, Cynara cardunculus, Lactuca sativa, Taraxacum kok-saghyz, Helianthus
annuus, Artemisia annua, Erigeron breviscapus, C. nankingense, and C. lavandulifolium). c, Analysis of intact LTR numbers and insertion time in 7
Asteraceae plants. d, Analysis of Copia and Gypsy copy numbers in C. lavandulifolium. e, Synonymous substitution rate (Ks) distribution for pairs of
syntenic paralogs in C. lavandulifolium and two other plants (H. annuus and C. cardunculus).
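
The contig and scaffold N50 and N90 values reported for this assembly (Table 1) follow the usual convention: sequences are sorted from longest to shortest, and the reported value is the length at which the running total first reaches 50% (or 90%) of the total assembly size. The text does not show this calculation, so the Python sketch below is only an illustration under that standard definition; the function name nxx and the toy lengths are hypothetical and are not taken from the C. lavandulifolium assembly.

```python
from typing import Iterable


def nxx(lengths: Iterable[int], fraction: float = 0.5) -> int:
    """Return the Nxx statistic for a set of sequence lengths.

    Sequences are sorted from longest to shortest; the result is the length
    of the sequence at which the cumulative sum first reaches `fraction`
    of the total assembly size (fraction=0.5 gives N50, 0.9 gives N90).
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    target = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    return ordered[-1]


# Hypothetical toy contig lengths (bp), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = nxx(toy_contigs, 0.5)
toy_n90 = nxx(toy_contigs, 0.9)
```

Given the real list of contig lengths, the same function would be expected to return the contig N50 and N90 listed in Table 1, assuming the authors used this standard definition.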

of miRNA, 4 families of rRNA, and 24 families of tRNA (Supplementary Table 6). A total of 12 097 pseudogenes were identified
using GeneWise. Benchmarking Universal Single-Copy Orthologs (BUSCOs) evaluation showed that 89.02% and 92.36% of complete genes
were obtained in genome mode and protein mode, respectively, which suggested the high quality of our assembled C. lavandulifolium
genome (Supplementary Table 7).
   Based on the high-quality reference genome in this study, 1.76 Gb of repetitive sequences of C. lavandulifolium were
predicted, with 60.25% being retrotransposons and 3.5% DNA transposons (Supplementary Table 8). Retrotransposons are the main
components of transposons, with Copia and Gypsy (37.92% and 29.06%, respectively) being the most common. Compared with other
Asteraceae species, LTR expansion of C. lavandulifolium was detected at ∼1.25 Mya, which was very close to that of C. nankingense
(∼1.45 Mya) (Figure 1c). Bursts of Copia and Gypsy occurred at ∼1.25 Mya, which was consistent with the LTR expansion time of
C. lavandulifolium (Figure 1d). This result indicated that LTR expansion in C. lavandulifolium was mainly driven by bursts of
Copia and Gypsy.

Evolution of the C. lavandulifolium genome
To study the conservation and specificity of the genomic structure of C. lavandulifolium, clustering of the predicted proteins
in the C. lavandulifolium genome with those from 4 other representative Asteraceae species showed 11 419 gene families shared by
5 species, with 2750 gene families specific to C. lavandulifolium (Supplementary Figure 4). Gene Ontology (GO) enrichment
analysis revealed that C. lavandulifolium-specific genes were mainly enriched in cellular process (GO: 0044763), metabolic
process (GO: 0044710) and catalytic activity (GO: 0003824) (Supplementary Table 9). Furthermore, a phylogenetic tree constructed
with 166 single-copy genes in C. lavandulifolium and the other 10 species showed that C. lavandulifolium diverged from
C. nankingense at approximately 7.2 Mya (Figure 1b). We also compared gene family expansion and contraction among the 11 species
to examine the evolution of the C. lavandulifolium genome (Figure 1b). The results showed that 1305 and 453 gene families were
expanded and contracted in C. lavandulifolium, respectively (Figure 1b). The gene families that were expanded in the
C. lavandulifolium genome were enriched in flower development-related GO terms and

cell synthesis-related GO terms (Supplementary Table 10).
   Whole-genome duplication (WGD) is one of the most [...]

[...] expressed genes between the two samples are shown in Supplementary Fig. 9b. Genes with an adjusted P value [...]
Figure 2. Transcriptomic profiling analysis of six important stages of capitulum development in C. lavandulifolium. a, The morphology of six
developmental stages across samples. Stage 1 (S1), vegetative stage; stage 2 (S2), doming stage; stage 5 (S5), the initiation stage of disc floret primordia;
stage 6 (S6), the initiation stage of ray floret primordia, at which stage disc and ray floret primordia began to differentiate; stage 9 (S9), the middle
stage of corolla primordia differentiation; and stage 10 (S10), the final stage of corolla primordia differentiation. FP, foliage primordia; Br, bract; DFP,
disc floret primordia; and RFP, ray floret primordia. b, Weighted gene coexpression network analysis of six developmental stages of flowers and leaves
in C. lavandulifolium. c, Expression patterns of genes in gray modules that might be involved in the development of stage 6. d, Candidate hub genes
involved in the genetic regulatory networks of stage 6 (gray). Yellow triangles represent the cis-regulatory motifs of those genes, green circles represent
the gene ID number, and blue hexagons represent the GO terms that were enriched. e, Heatmap of NAM/CUC-LIKE and LOB30-LIKE at different
developmental stages of C. lavandulifolium.
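
Weighted gene coexpression network analysis (Figure 2b) ranks genes by how strongly their expression profiles correlate with those of many other genes. The sketch below illustrates only the core idea, a soft-thresholded correlation adjacency followed by a per-gene connectivity score, and should not be read as the authors' pipeline. The gene names, expression values and the exponent beta = 6 are hypothetical, and a full WGCNA run includes further steps (for example topological overlap and module detection) that are omitted here.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(vx * vy)


def connectivity(expr: Dict[str, List[float]], beta: int = 6) -> Dict[str, float]:
    """Sum of soft-thresholded adjacencies |cor|**beta to all other genes."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta
            k[gi] += a
            k[gj] += a
    return k


# Hypothetical expression values across six stages (S1, S2, S5, S6, S9, S10).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 9.0, 4.0, 2.0],
    "geneB": [2.0, 4.0, 6.0, 18.0, 8.0, 4.0],  # same pattern as geneA, scaled
    "geneC": [1.5, 2.5, 3.5, 9.5, 4.5, 2.5],   # same pattern as geneA, shifted
    "geneD": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],   # unrelated pattern
}
k = connectivity(toy_expr)
ranking = sorted(k, key=k.get, reverse=True)  # most connected genes first
```

In this toy data the three genes that peak at S6 are tightly connected to one another, while geneD has almost no connectivity; candidate hub genes would be drawn from the top of such a ranking within a module.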

of them expanded during the evolutionary history of the C. lavandulifolium genome (Supplementary Figure 16 and Supplementary
Table 15). In our studies, B- and C-class genes were highly expressed when the second and third whorls of disc and ray florets
began to initiate (Figure 4a, Supplementary Figures 17 and 19). We identified three A-class genes and four E-class genes
expressed during early capitulum development (Figure 4a, Supplementary Figures 17 and 18). Notably, the A-class gene CAULIFLOWER
(CAL, EVM0046680) and E-class gene SEPALLATAa (SEPa, EVM0006753) were highly expressed at stage 5, when only disc floret
primordia began to initiate (Figure 4a and Supplementary Figure 17). CAL was located on chr05, and SEPa was located on chr02
(Supplementary Table 14). The A-class gene FRUITFULL (FUL, EVM0011418) was mainly expressed at stages 5 and 6, especially at
stage 6, when ray floret primordia began to initiate on the capitulum (Figure 4a and Supplementary Figure 17). These results
indicated that CAL, SEPa and FUL might regulate the differentiation of disc and ray florets.
   CYC2-LIKE genes are known to regulate the symmetry of flowers [26]. Homology analysis showed that there
Figure 3. The expression patterns of NAM/CUC and LOB30 homologous genes in ten Asteraceae species. Notes: Transcriptomic profiling of
development stages in radiate, discoid and ligulate capitula. Stage 6 of the radiate capitulum has both disc and ray floret primordia, stage 6 of the
discoid capitulum only has disc floret primordia, and stage 6 of the ligulate capitulum only has ray floret primordia. RF, ray florets; DF, disc florets; FP,
foliage primordia; Br, bract; DFP, disc floret primordia; and RFP, ray floret primordia.

Figure 4. The probable gene regulation mechanism in the development of different capitula types. a, The expression patterns of ABCE-class and
CYC2-LIKE genes in C. lavandulifolium during capitulum development. b, Gene and protein interactions involved in different capitula types. The dark
and red solid lines represent protein interactions predicted by STRING. The blue solid lines represent the protein interaction verified by experiments in
Wen et al. 2019. The red dotted lines represent protein interactions during capitulum development in Asteraceae.
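
The hub role attributed to LOB30 in Figure 4b can be illustrated by counting interaction partners per gene. The edge list in the sketch below is transcribed only from the interactions that the Results state explicitly (CUC2 with LFY, AG and CUC3; LOB30 with LFY, TFL1, CUC2, CUC3 and NAM). It is therefore a partial, hand-entered network rather than the STRING output itself, and the helper name degrees is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def degrees(edge_list: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct interaction partners for each gene (undirected edges)."""
    neighbours = defaultdict(set)
    for a, b in edge_list:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


# Interactions named in the Results text for Figure 4b (partial list).
edges = [
    ("CUC2", "LFY"), ("CUC2", "AG"), ("CUC2", "CUC3"),
    ("LOB30", "LFY"), ("LOB30", "TFL1"), ("LOB30", "CUC2"),
    ("LOB30", "CUC3"), ("LOB30", "NAM"),
]

deg = degrees(edges)
most_connected = max(deg, key=deg.get)
```

On this partial edge list LOB30 has five partners and CUC2 four, which is consistent with the description of LOB30 as a hub; interactions with floral organ identity and CYC2-LIKE genes that are mentioned without naming specific partners are not included.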

were eight CYC2-LIKE genes in the C. lavandulifolium genome; these genes were mainly distributed on chr06, chr07 and chr08
(Supplementary Figure 20, Supplementary Tables 14 and 16). Most of the CYC2-LIKE genes showed increasing expression with the
development process, except CYC2a1 (EVM0062019), which was mainly expressed at vegetative stage 1 (Figure 4a and Supplementary
Figure 21). Moreover, tandem duplication events

in the C. lavandulifolium genome drove the duplication of CYC2c/2d/2e/2f, and this gene set might have undergone
subfunctionalization during the evolution of the C. lavandulifolium genome (Supplementary Figure 22). CYC2c and CYC2d showed
different expression patterns, which indicated that they were subfunctionalized (Figure 4a and Supplementary Figure 23). The
expression patterns of CYC2e and CYC2f were similar, which indicated that these two genes had redundant functions in regulating
capitulum development (Figure 4a and Supplementary Figure 21). Notably, the duplication of CYC2a was unique in the genus
Chrysanthemum, and both copies were expressed during the early capitulum development stage, especially CYC2a2 (EVM0076812), which
was highly expressed at stage 5 (Figure 4a, Supplementary Figures 21 and 22). This result suggested that CYC2a2 might be a unique
gene in regulating capitulum development in the genus Chrysanthemum. In summary, five ABCE-class and CYC2-LIKE genes were mainly
expressed at stages 5 and 6, which indicated that they might be involved in disc and ray floret differentiation (Figure 4a).
However, homologous genes of ABCE-class and CYC2-LIKE in other Asteraceae species did not show obvious expression differences
among different capitula types (Supplementary Figure 23).
   Protein interactions among these candidate genes were predicted by STRING and previous studies (Figure 4b). In addition to
ABCE-class and CYC2-LIKE genes, inflorescence meristem-related genes (FT, TFL1 and LFY) are also involved in capitulum
development. However, the expression levels of FT, TFL1 and LFY at stage 5 and stage 6 were relatively lower than those at other
developmental stages (Supplementary Figure 24). In the discoid capitulum, CUC2 interacted with LFY and AG to regulate the
initiation of disc floret primordia (Figure 4b). When CUC2 interacted with CUC3, the expression of these two genes could promote
the development of floret primordia into ray florets (Figure 4b). During radiate capitulum development, the existence of NAM and
LOB30 contributed to the differentiation of ray and disc floret primordia. LOB30 could interact with LFY, TFL1, CUC2, CUC3 and
NAM, indicating its hub role in the genetic regulatory network (Figure 4b). Overall, NAM and LOB30 interacted with not only
inflorescence meristem-related genes (LFY) but also floral organ identity genes and CYC2-LIKE genes during capitulum development,
indicating the hub roles of NAM and LOB30 in the differentiation of disc and ray florets on the radiate capitulum.

Discussion
To date, the genomes of 16 Asteraceae species have been sequenced, including L. sativa, E. breviscapus, Artemisia annua,
C. nankingense, C. seticuspe, H. annuus, and Mikania micrantha (Supplementary Table 1). In this study, a high-quality
chromosomal-scale reference genome of C. lavandulifolium was successfully obtained using Nanopore and PacBio technologies with
assistance from a Hi-C heatmap. The reference genome of C. lavandulifolium that we obtained displayed higher integrality and
accuracy than that for C. nankingense at the chromosome level [3]. Compared with the assembled genomes of other Asteraceae
species, the scaffold N50 of the C. lavandulifolium genome was the longest, at up to 300 Mb.
   Flower crops with new and unique flower types could have great economic value, and flower type modification is an important
goal of ornamental breeding. Genetic manipulation of floral development-related genes is an effective method to directionally
modify flower types. Previous studies have found that modifying the expression of floral-related genes, such as MADS-box and TCP
transcription factors, could directionally improve the floral types [15]. These studies mainly concentrate on ABCE-class and
CYC2-LIKE genes to influence the identity of floral organs and the symmetry of a single flower [25]. However, the flowers of some
plants condense to form an inflorescence called a “pseudanthium”, which usually has higher ornamental value. The regulation of
this complex inflorescence remains to be clarified. Chrysanthemums, as one of the most valuable ornamental plants, are famous for
their diverse capitulum types. The modification of capitulum types in chrysanthemum will also provide insights for other
inflorescence type modifications. However, the study of the molecular mechanism of capitulum development is hindered by the
complexity of flower types and the lack of genomic data.
   The expression patterns of ABCE-class genes in the capitulum developmental process are different from those in a single
flower, but their functions are relatively conserved and mainly play roles in regulating the identity of the four floral organ
whorls during disc and ray floret development [15]. The present study found that most of the ABCE-class genes began to be highly
expressed during stage 9 to stage 10, the stages in which the floral organs initiate on the disc and ray florets. The CYC2-LIKE
genes in Asteraceae expanded significantly. CYC2-LIKE genes have been subfunctionalized and neofunctionalized in regulating the
differentiation of disc and ray florets [27]. Previous transgenic studies of CYC2c and CYC2d in C. lavandulifolium have changed
the length of ray florets to some extent, which indicates that CYC2-LIKE regulates floret development via the control of dorsal
petal elongation [27, 28]. Combined with our wide survey of CYC2-LIKE genes during the key developmental stage of
C. lavandulifolium capitulum, CYC2e/2f may be redundant with CYC2c/2d, and CYC2a may have evolved a new function in the genus
Chrysanthemum. Overall, the gene expression and evolution at the genome level showed that the ABCE-class and CYC2-LIKE genes
contributed to the capitulum development of C. lavandulifolium.
   However, transgenic studies of ABCE-class and CYC2-LIKE genes in Gerbera, Senecio vulgaris and C.

lavandulifolium did not alter the identity of disc and ray florets, although some of these genes could change the morphology of the two kinds of florets or cause the complete loss of the inflorescence meristem [15]. It appears that ABCE-class and CYC2-like genes in Asteraceae are not the hub genes in the regulatory network of disc and ray floret differentiation. WGCNA showed that the hub genes regulating disc and ray floret differentiation in the capitulum were NAM and LOB30, and NAM/CUC and LOB30 subfamily genes not only regulate the initiation and orientation of organ primordia in early flower development but also respond to inflorescence meristem-related genes and floral identity genes [29–31]. Based on the hub roles of NAM and LOB30 in stages 5 and 6, we concluded that the two genes might regulate the differentiation of disc and ray floret primordia in early capitulum development and that they could interact with downstream genes to regulate the floral organ identity and development of each floret on the capitulum. Duplication of NAM and its chromosome position in the C. lavandulifolium genome indicated that NAM might be a special key gene in regulating the diverse capitula types of chrysanthemum. Protein interactions showed that these hub genes could interact with LFY, which has been proven to regulate ray floret development in Gerbera [31]. Previous studies supported the idea that the capitulum was derived from a cyme in which peripheral branches were inhibited [32]. The interaction of NAM/CUC and LOB30 regulated the expression of LFY to prevent the development of peripheral branches and promote peripheral ray floret primordia in different capitula types. In conclusion, the C. lavandulifolium genome presented in this study provides a powerful reference for the further assembly of complex genomes, especially for the deciphering of chrysanthemum genomes. Based on comparative genomic and transcriptomic analyses, we identified hub genes that might be involved in the identity of disc and ray florets, which could serve as candidate genes for further genome editing to modify the capitula types in chrysanthemum.

Materials and methods
Plant materials and sequencing
The C. lavandulifolium G1 line was collected and cultured at Beijing Forestry University for genomic sequencing [33]. The genomic DNA of the C. lavandulifolium G1 line was extracted using a standard CTAB protocol. The paired-end libraries were sequenced on the Illumina HiSeq 4000 sequencing platform (Illumina, San Diego, CA, USA). The 27-mer frequencies were generated using 135.61 Gb of high-quality PE reads (51.43 ×), and a total of 1 × 10^11 k-mers were obtained by a customized Perl script. The main peak value representing the average k-mer depth was 38. A modified formula was used to estimate the C. lavandulifolium genome size, G = n_k-mer / d_average k-mer (G represents the genome size, n_k-mer represents the k-mer total number, and d_average k-mer is the average k-mer depth). Genome sequencing was performed using SMRT sequencing on a PacBio RS II sequencer (Pacific Biosciences, Menlo Park, CA, USA) following the manufacturer’s standard protocol, and 201.85 Gb of PacBio data were acquired using SMRT Analysis software v1.2 [34]. To further improve the genomic assembly quality, 193.88 Gb of clean data were generated using a PromethION sequencer (Oxford Nanopore Technologies, Oxford, UK).

Genome assembly and quality assessment
For the PacBio RS II platform data, longer subreads were selected by the error correction module of Canu v1.5 [35]. Raw overlapping subreads were detected through the highly sensitive overlap detection program MHAP v2.1 [36], and the error correction of these data was carried out by the Falcon sense method v0.40 (“correctedErrorRate = 0.025”) [37]. The error-corrected subreads were used to generate a draft assembly in WTDBG v2.5 (https://github.com/ruanjue/wtdbg). Iterative polishing by Pilon v1.22 [38] was achieved by aligning adapter-trimmed, paired-end Illumina reads to the PacBio draft genome. Clean Nanopore data were acquired via sequencing on the PromethION platform, and these data were corrected with the same method described above. The draft genome assembled by WTDBG v2.5 (https://github.com/ruanjue/wtdbg) was corrected three times by Racon v1.3.3 [39] by aligning adapter-trimmed, paired-end Illumina reads to the Nanopore draft genome. Then, the PacBio draft genome as a query input was aligned against the Nanopore draft genome using MUMmer v4.0.0 [40]. The PacBio draft genome and Nanopore draft genome were then merged using quickmerge v0.3.0 [21]. This merged draft genome was polished by Racon v1.3.3 [39] and Pilon v1.22 [38]. The mapping depth was obtained by aligning corrected Nanopore sequencing data to the merged assembly by minimap2 v2.17 [41] with default parameters. Then, purge haplotigs v1.0.4 [22] was used to eliminate redundancy according to the coverage depth and obtain the purged haplotig genome. Ultimately, a C. lavandulifolium genome with a total length of 3.10 Gb was obtained. The second-generation sequencing data, core gene completeness and BUSCOs were evaluated to verify the accuracy of the genome assembly.

Hi-C sequencing and assisted assembly
Hi-C is a technology derived from chromosome conformation capture technology that utilizes high-throughput sequencing data and is mainly used to assist in genome assembly. We constructed Hi-C fragment libraries with insert sizes of 300–700 bp, as described by Rao et al. [42], and sequenced them on the Illumina platform. Before chromosome assembly, we first performed a preassembly for the error correction of scaffolds, which required splitting the scaffolds into segments of 50 kb, on average. Then, the Hi-C data
were mapped to these segments using BWA aligner v0.7.10-r789 [43]. We retained the uniquely mapped data to assemble the genome using LACHESIS [44] with the following parameters: CLUSTER_MIN_RE_SITES = 80; CLUSTER_MAX_LINK_DENSITY = 2; CLUSTER_NONINFORMATIVE_RATIO = 2; ORDER_MIN_N_RES_IN_TRUNK = 16; and ORDER_MIN_N_RES_IN_SHREDS = 16. To further address the redundant sequences, we manually checked any two segments that showed inconsistent connections with the raw scaffold. The detailed workflow schema for the assembly pipeline of the chromosome-scale C. lavandulifolium genome is shown in Supplementary Figure 24.

Transcriptome sequencing
The reproductive-stage leaf and developmental series of capitula (stage 1, stage 2, stage 5, stage 6, stage 9 and stage 10) of the C. lavandulifolium G1 line were sampled to establish transcriptomic profiling (Figure 2a, Supplementary Table 12). The reproductive-stage leaves, vegetative buds and reproductive buds of nine other species of Asteraceae were also sampled to construct libraries for RNA-seq (Figure 2b, Supplementary Table 13 and Supplementary Figure 15). Total RNA was extracted using a Plant RNA Rapid Extraction Kit (HUAYUEYANG Biotechnology, Beijing, China) and treated with RNase-free DNase I to digest the DNA. After assessing the purity and integrity of RNA using the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA) and the ABI StepOnePlus Real-Time PCR System (Applied Biosystems, Waltham, MA, USA), the constructed libraries were sequenced on an Illumina HiSeq 2500 sequencing platform (Illumina, San Diego, CA, USA) [45]. After adaptors and low-quality sequences had been removed from the raw reads, the clean reads were aligned to our de novo genome of C. lavandulifolium using TopHat2 (version 2.0.7) [46] and then assembled using Cufflinks [47]. The protein-coding genes were annotated against the NCBI NR (http://www.ncbi.nlm.nih.gov), SwissProt, GO, COG, KOG, eggNOG, and KEGG databases. Gene expression was calculated using Cuffquant and Cuffnorm in Cufflinks [47].

Gene annotation
To better predict the protein-coding genes, a pipeline that combined de novo gene prediction, unigene prediction and homology-based prediction was used. For de novo prediction, Genscan [48], Augustus v2.4 [49], GlimmerHMM v3.0.4 [50], GeneID v1.4 [51], and SNAP [52] were used; for homology prediction, GeneWise v2.2.0 [53] was used with C. lavandulifolium protein sequences, and gene models were required to reach a minimum coverage of 50%. For unigene prediction, Illumina reads were filtered to remove adaptors and trimmed to remove low-quality bases. Processed reads were aligned to the reference genome, and then the transcripts were assembled using HISAT v2.0.4 [54] and StringTie [55]. Coding sequences were predicted using GeneMarkS-T (version 5.1) [56] and TransDecoder v2.0 (http://transdecoder.github.io). Finally, EVM v1.1.1 [57] was used to integrate the prediction results, with the following parameters: --min_intron_length 2 --terminal_intergenic_re_search 10000. MicroRNA and rRNA were identified by BLAST with an E-value cutoff of 1e-10 based on the Rfam database [58]. Transfer RNAs (tRNAs) were predicted using tRNAscan-SE [59]. Repetitive sequences were predicted using RepeatMasker [60].

Gene family identification, genome evolution analysis and species tree construction
The alignments of protein sequences were performed using Diamond v0.9.29.130 (http://www.diamondsearch.org/index.php) [61] with an E-value of 0.001. Orthologous and paralogous gene families were identified by OrthoFinder v2.3.7 [62] with default parameters. A phylogenetic tree based on the concatenated sequence alignment of 166 single-copy gene families from C. lavandulifolium and 10 other plant species was constructed using IQ-TREE with the selected optimal sequence evolution model (-m JTT+F+R5) and with ultrafast bootstrapping [63]. MCMCTREE of PAML (v4.9) was used to estimate the divergence times [64]. Ks-based age distributions were analyzed by using PAML to calculate the synonymous substitution rate (Ks) values. LTR sequences were identified and filtered using LTR_FINDER v1.07 (score = 6) [65]. Then, the flanking sequences of the LTRs were extracted and compared using MAFFT (parameters: --localpair --maxiterate 1000) [66]. The distance K was calculated by the Kimura model using EMBOSS v6.6.0 [67]. The formula for calculating time is T = K/(2 × r) with the molecular clock r = 7 × 10^−9 mutations per site per year.

WGCNA of flower development in C. lavandulifolium
To analyze genes involved in the six capitulum developmental stages of C. lavandulifolium, weighted correlation network analysis (WGCNA) was performed using the R package [68]. The soft thresholding power was set to 7 to construct an adjacency matrix of genes with different expression patterns, and the topological overlap matrix (TOM) similarity algorithm was used to transform the adjacency matrix into a topological overlap matrix to reduce noise and false correlations. Then, all DEGs were hierarchically clustered based on TOM similarity. Hierarchical clustering was performed by Dynamic Hybrid Tree Cut [69]. The genes in different colored modules were converted to module eigengenes using the first principal component. The different capitulum development stages in C. lavandulifolium were also correlated with the eigengenes of each module to find the key module associated with capitulum development. The expression heatmap of candidate genes was constructed using TBtools [70].
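
As an illustration of the WGCNA steps described in the Methods (a soft thresholding power of 7, followed by the topological overlap transformation), the following is a minimal Python sketch. It is not the WGCNA R package used in the study and does not reproduce its defaults: it assumes an unsigned network, the standard unsigned topological overlap formula, and four hypothetical gene profiles across six stages that were invented for the example. The function names are likewise illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency(profiles, power=7):
    """Unsigned soft-threshold adjacency a_ij = |cor(i, j)| ** power (diagonal 0)."""
    n = len(profiles)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(profiles[i], profiles[j])) ** power
            adj[i][j] = adj[j][i] = a
    return adj

def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * n for _ in range(n)]  # diagonal is 1 by convention
    for i in range(n):
        for j in range(i + 1, n):
            # The zero diagonal of adj excludes u == i and u == j automatically.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n))
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom

# Hypothetical toy data: 4 genes x 6 developmental stages (not from the study).
profiles = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.1, 5.9, 8.2, 9.8, 12.1],
    [6.0, 5.0, 4.2, 2.9, 2.1, 1.0],
    [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
]
adj = adjacency(profiles, power=7)
tom = topological_overlap(adj)
# 1 - TOM is the dissimilarity that would be passed to hierarchical clustering.
dissimilarity = [[1.0 - t for t in row] for row in tom]
```

Raising the absolute correlation to the power 7 suppresses weak correlations much more strongly than strong ones, which is why the topological overlap step is described as reducing noise. Module detection and eigengene calculation would follow from the dissimilarity matrix and are not shown here.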

Evolution and expression analysis of key candidate gene families
The evolution and expression of genes related to capitulum development in C. lavandulifolium were investigated. Based on the previously described results for transcriptome analysis, we chose key gene families for further analysis. The protein sequences of those candidate gene families were scanned using BLASTP and HMMER. For BLASTP and HMMER, initial gene sets were filtered with a default cutoff E-value of 1e-5. Then, phylogenetic trees were established using FastTree (v2.1) [71] and modified by iTOL (https://itol.embl.de/).

Acknowledgments
This work was supported by grants from the National Natural Science Foundation of China (No. 31530064) and National Key Research and Development Project (2018YFD1000403). We are particularly thankful to Xia Xu and Guanghui Zhang for providing T. kok-saghyz and E. breviscapus. We are also grateful to Hongqing Ling and Yalong Guo for their helpful suggestions on our work.

Author contributions
S.L.D., L.S.Z., Y.N.J., and X.H.W. conceived and designed the study. S.L.D., L.S.Z., and J.Y.N. discussed and modified the study results. X.H.W., J.Z.L., Q.G., P.X., C.F.L., Y.P., H.L., Q.L.Z., X.Y.W., C.F.M., and K.G. prepared the materials, conducted the experiments, analyzed the data and prepared the results. X.H.W., J.Z.L., and L.C.F. wrote the manuscript. K.D.R. and Y.H. were involved in data interpretation and finalizing the manuscript draft. All authors read and approved the final draft.

Data availability
The raw sequence data and assembly of C. lavandulifolium genome sequencing and RNA sequencing have been deposited in NCBI (PRJNA681093). The final assembly and annotation of the C. lavandulifolium genome are available at GenBank under accession number JAHFWF000000000. The accession numbers of other transcriptome data sequenced in the present study are shown in Supplementary Table 13.

Competing interests
The authors declare no competing interests.

Supplementary data
Supplementary data is available at Horticulture Research Journal online.

References
1. Song XB, Xu Y, Gao K et al. High-density genetic map construction and identification of loci controlling flower-type traits in chrysanthemum (Chrysanthemum × morifolium Ramat.). Horticulture Research. 2020;7:108.
2. He SM, Dong X, Zhang G et al. High quality genome of Erigeron breviscapus provides a reference for herbal plants in Asteraceae. Mol Ecol Resour. 2020;00:1–17.
3. Song C, Liu Y, Song A et al. The Chrysanthemum nankingense genome provides insights into the evolution and diversification of chrysanthemum flowers and medicinal traits. Mol Plant. 2018;11:1482–91.
4. Hirakawa H, Sumitomo K, Hisamatsu T et al. De novo whole-genome assembly in Chrysanthemum seticuspe, a model species of chrysanthemums, and its application to genetic and gene discovery analysis. DNA Res. 2019;26:195–203.
5. Wellmer F, Riechmann JL. Gene networks controlling the initiation of flower development. Trends Genet. 2010;26:519–527.
6. Thomson B, Wellmer F. Molecular regulation of flower development. Curr Top Dev Biol. 2019;131:185–210.
7. Krizek BA, Fletcher JC. Molecular mechanisms of flower development: an armchair guide. Nature Rev Genet. 2005;6:688–98.
8. Chen F, Song Y, Li X et al. Genome sequences of horticultural plants: past, present, and future. Horticulture Research. 2019;6:112.
9. Krizek BA. eLS. London: Wiley; 2020.
10. Zhang QG, Liu KW, Li Z et al. The Apostasia genome and the evolution of orchids. Nature. 2017;549:379–83.
11. Li MM, Zhang D, Gao Q et al. Genome structure and evolution of Antirrhinum majus L. Nature Plants. 2019;5:174–83.
12. Zhang LS, Chen F, Zhang X et al. The water lily genome and the early evolution of flowering plants. Nature. 2020;577:79–84.
13. Ding L, Song A, Zhang X et al. The core regulatory networks and hub genes regulating flower development in Chrysanthemum morifolium. Plant Mol Biol. 2020;103:669–88.
14. Elomaa P, Zhao Y, Zhang T et al. Flower heads in Asteraceae - recruitment of conserved developmental regulators to control the flower-like inflorescence architecture. Horticulture Research. 2018;5:1–10.
15. Zoulias N, Duttke SHC, Garcês H et al. Auxin and pattern formation of the Asteraceae flower head (capitulum). Plant Physiol. 2019;179:391–401.
16. Li F, Lan W, Zhou Q et al. Reduced expression of CbUFO is associated with the phenotype of a flower-defective Cosmos bipinnatus. Int J Mol Sci. 2019;20:2503.
17. Dai SL, Zhang CJ, Chen J et al. Advances of researches on phylogeny of Dendranthema and origin of chrysanthemum. Journal of Beijing Forestry University. 2002;24:234–8.
18. Yang LW, Wen XH, Fu JX et al. ClCRY2 facilitates floral transition in Chrysanthemum lavandulifolium by affecting the transcription of circadian clock-related genes under short-day photoperiods. Horticulture Research. 2018;5:58.
19. Wen XH, Qi S, Huang H et al. The expression and interactions of ABCE-class and CYC2-like genes in the capitulum development of Chrysanthemum lavandulifolium and C. × morifolium. Plant Growth Regul. 2019;88:205.
20. Qi S, Yang L, Wen X et al. Reference gene selection for RT-qPCR analysis of flower development in Chrysanthemum morifolium and Chrysanthemum lavandulifolium. Front Plant Sci. 2016;7:651.
21. Chakraborty M, Baldwin-Brown JG, Long AD et al. Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage. Nucleic Acids Res. 2016;44:e147.
22. Roach MJ, Schmidt SA, Borneman AR. Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies. BMC Bioinformatics. 2018;19:460.
23. Qiao X, Li Q, Yin H et al. Gene duplication and evolution in recurring polyploidization-diploidization cycles in plants. Genome Biol. 2019;20:38.
24. Badouin H, Gouzy J, Grassa CJ et al. The sunflower genome provides insights into oil metabolism, flowering and Asterid evolution. Nature. 2017;546:7675.
25. Liu B, Yan J, Li W et al. Mikania micrantha genome provides insights into the molecular mechanism of rapid growth. Nature Communications. 2020;11:1.
26. Spencer V, Kim M. Re"CYC"ling molecular regulators in the evolution and development of flower symmetry. Semin Cell Dev Biol. 2018;79:16–26.
27. Chen J, Shen CZ, Guo YP et al. Patterning the Asteraceae Capitulum: duplications and differential expression of the flower symmetry CYC2-like genes. Front Plant Sci. 2018;9:1–14.
28. Huang CH, Zhang C, Liu M et al. Multiple polyploidization events across Asteraceae with two nested events in the early history revealed by nuclear phylogenomics. Mol Biol Evol. 2016;33:2820–35.
29. Zadnikova P, Simon R. How boundaries control plant development. Curr Opin Plant Biol. 2014;17:116–25.
30. Mara C, Manrique S, Cuesta C et al. CUP-SHAPED COTYLEDON1 (CUC1) and CUC2 regulate cytokinin homeostasis to determine ovule number in Arabidopsis. J Exp Bot. 2018;69:5169–76.
31. Rebocho AB, Kennaway JR, Bangham JA et al. Formation and shaping of the Antirrhinum flower through modulation of the CUP boundary gene. Curr Biol. 2017;27:2610–2622.e3.
32. Zhao Y, Zhang T, Broholm SK et al. Evolutionary co-option of floral meristem identity genes for patterning of the flower-like Asteraceae inflorescence. Plant Physiol. 2016;172:284–96.
33. Wen XH, Pu Y, Liu Y. Effects of N, P and K nutrients on the growth and development of Chrysanthemum lavandulifolium based on BBCH scale. Advanced in Ornamental Horticulture of China. 2019;1:76–84.
34. Ramsköld D, Luo S, Wang YC et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol. 2012;30:777–82.
35. Koren S, Walenz BP, Berlin K et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27:722–36.
36. Drake JP, Berlin K, Koren S et al. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol. 2015;33:623–30.
37. Chin CS, Peluso P, Sedlazeck FJ et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat Methods. 2016;13:1050–4.
38. Walker BJ, Abeel T, Shea T et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One. 2014;9:e112963.
39. Vaser R, Sović I, Nagarajan N et al. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27:737–46.
40. Marcais G, Delcher AL, Phillippy AM et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput Biol. 2018;14:e1005944.
41. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–100.
42. Rao SS, Huntley MH, Durand NC et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–80.
43. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics. 2010;26:589–95.
44. Burton JN, Adey A, Patwardhan RP et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol. 2013;31:1119–25.
45. Singh KS, Wu Y, Ghosh JS et al. RNA-sequencing reveals global transcriptomic changes in Nicotiana tabacum responding to topping and treatment of axillary-shoot control chemicals. Sci Rep. 2016;5:18148.
46. Kim D, Pertea G, Trapnell C et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14:4.
47. Trapnell C, Roberts A, Goff L et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7:562–78.
48. Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 1997;268:78–94.
49. Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19:ii215–25.
50. Majoros WH, Pertea M, Salzberg SL. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 2004;20:2878–9.
51. Blanco E, Genís P, Roderic G. Using GENEID to identify genes. Curr Protoc Bioinformatics. 2007;18:1.
52. Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
53. Birney E, Clamp M, Durbin R. GeneWise and Genomewise. Genome Res. 2004;14:988–95.
54. Kim D, Langmead B, Salzberg SL. HISAT: a fast-spliced aligner with low memory requirements. Nat Methods. 2015;12:357–60.
55. Pertea M, Pertea GM, Antonescu CM et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Bio/technology (Nature Publishing Company). 2015;33:290–5.
56. Tang SYY, Alexandre L, Mark B. Identification of protein coding regions in RNA transcripts. Nucleic Acids Symp Ser. 2015;43:e78.
57. Haas BJ, Salzberg SL, Zhu W et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments. Genome Biol. 2008;9:R7.
58. Griffiths-Jones S, Moxon S, Marshall M et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res. 2004;33:D121–4.
59. Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–64.
60. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009;25:1.
61. Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2015;12:59–60.
62. Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015;16:157.
63. Lam-Tung N, Schmidt HA, von Haeseler A et al. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;1:268–74.
64. Rannala YB. Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds. Mol Biol Evol. 2006;23:212–26.
65. Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35:W265–8.
66. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80.
67. Rice P, Longden I, Bleasby A. EMBOSS: the European molecular biology open software suite. Trends Genet. 2000;16:276–7.
68. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. Bioinformatics. 2008;9:559.
69. Langfelder P, Zhang B, Horvath S et al. Defining clusters from a hierarchical cluster tree: the dynamic tree cut package for R. Bioinformatics. 2008;24:719–20.
70. Chen C, Chen H, Zhang Y et al. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol Plant. 2020;13:1194.
71. Morgan NP, Dehal PS, Arkin AP et al. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol. 2009;7:1641–50.