GBE A Chromosome-Level Genome Assembly of the Reed Warbler (Acrocephalus scirpaceus)
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
GBE A Chromosome-Level Genome Assembly of the Reed Warbler (Acrocephalus scirpaceus) Camilla Lo Cascio Sætre1,*, Fabrice Eroukhmanoff1, Katja Rönk€a2,3, Edward Kluen2,3, Rose Thorogood2,3, James Torrance4, Alan Tracey4, William Chow4, Sarah Pelan4, Kerstin Howe4, Kjetill S. Jakobsen1, and Ole K. Tørresen1 1 Centre for Ecological and Evolutionary Synthesis, University of Oslo, Norway Downloaded from https://academic.oup.com/gbe/article/13/9/evab212/6367782 by guest on 18 December 2021 2 HiLIFE Helsinki Institute of Life Sciences, University of Helsinki, Finland 3 Research Programme in Organismal and Evolutionary Biology, Faculty of Biological and Environmental Sciences, University of Helsinki, Finland 4 Tree of Life, Wellcome Sanger Institute, Cambridge, United Kingdom *Corresponding author: E-mail: c.l.c.satre@ibv.uio.no. Accepted: 6 September 2021 Abstract The reed warbler (Acrocephalus scirpaceus) is a long-distance migrant passerine with a wide distribution across Eurasia. This species has fascinated researchers for decades, especially its role as host of a brood parasite, and its capacity for rapid phenotypic change in the face of climate change. Currently, it is expanding its range northwards in Europe, and is altering its migratory behavior in certain areas. Thus, there is great potential to discover signs of recent evolution and its impact on the genomic composition of the reed warbler. Here, we present a high-quality reference genome for the reed warbler, based on PacBio, 10, and Hi-C sequencing. The genome has an assembly size of 1,075,083,815 bp with a scaffold N50 of 74,438,198 bp and a contig N50 of 12,742,779 bp. BUSCO analysis using aves_odb10 as a model showed that 95.7% of BUSCO genes were complete. We found unequivocal evidence of two separate macrochromosomal fusions in the reed warbler genome, in addition to the previously identified fusion between chromosome Z and a part of chromosome 4A in the Sylvioidea superfamily. We annotated 14,645 protein-coding genes, and a BUSCO analysis of the protein sequences indicated 97.5% completeness. This reference genome will serve as an important resource, and will provide new insights into the genomic effects of evolutionary drivers such as coevolution, range expansion, and adaptations to climate change, as well as chromosomal rearrangements in birds. Key words: genome assembly, Hi-C sequencing, long reads, reference genome, Acrocephalus scirpaceus. . Significance The reed warbler (Acrocephalus scirpaceus) has been lacking a genomic resource, despite having been broadly researched in studies of coevolution, ecology, and adaptations to climate change. Here, we generated a chromo- some-length genome assembly of the reed warbler, and found evidence of macrochromosomal fusions in its genome, which are likely of recent origin. This genome will provide the opportunity for a deeper understanding of the evolution of genomes in birds, as well as the evolutionary path and possible future of the reed warbler. Introduction parasitic common cuckoo (Cuculus canorus) (Davies and The ecology and evolution of the reed warbler (Acrocephalus Brooke 1989; Stokke et al. 2018). Decades of field experiments scirpaceus) has been of interest for over 40 years (Thorogood et have demonstrated behavioral coevolution and spatial and al. 2019) as it is one of the favorite host species of the brood- temporal variation in species interactions (e.g., Thorogood ß The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. Genome Biol. Evol. 13(9) doi:10.1093/gbe/evab212 Advance Access publication 9 September 2021 1
Sætre et al. GBE and Davies 2013). However, the reed warbler’s response to 277,617,608 paired-end reads (2150) with 10 Genomics, climate change has begun to attract increasing attention. and 185,974,525 paired-end reads (2150) with Hi-C, at Reed warblers are experiencing far less severe declines in pop- 83 and 56 coverage, respectively. The final genome as- ulation size than is typical for long-distance migrants (Both et al. sembly was 1.08 Gb in length, and contains 1,081 contigs 2010; Vickery et al. 2014). In fact, they are expanding their (contig N50 of 13 Mb) and 200 scaffolds (scaffold N50 of breeding range northwards into Fennoscandia (J€arvinen and 74 Mb) (table 1). The completeness of the assembled genome Ulfstrand 1980; Røed 1994; Stolt 1999; Brommer et al. is high: of the 8,338 universal avian single-copy orthologs, we 2012), and have generally increased their productivity follow- identified 7,978 complete BUSCOs (95.7%), including 7,920 ing the rise in temperature (Schaefer et al. 2006; Eglington et single-copy (95.0%), and 58 duplicated BUSCOs (0.7%). al. 2015; Meller et al. 2018). They are also showing rapid Fifty-nine BUSCOs (0.7%) were fragmented, and 301 Downloaded from https://academic.oup.com/gbe/article/13/9/evab212/6367782 by guest on 18 December 2021 changes in phenology (Halupka et al. 2008), and migratory BUSCOs (3.6%) were missing. behavior; instead of crossing the Sahara, monitoring suggests that some reed warblers now remain on the Iberian Peninsula over winter (Chamorro et al. 2019). Morphological traits such Genome Annotation as body mass and wing shape have been shown to change The GC content of the reed warbler genome assembly was rapidly in reed warbler populations, indicating possible local 41.9%. The total repeat content of the assembly was adaptation (Kralj et al. 2010; Salewski et al. 2010; Sætre et 10.94%, with LTR elements as the most common type of al. 2017). Genetic differentiation is generally low between reed repeat (4.50%) followed by LINEs (4.11%) (table 1). warbler populations, but moderate levels of differentiation Using the Comparative Annotation Toolkit, based on a have been connected to both migratory behavior (Prochazka whole-genome multiple alignments from Cactus, we pre- et al. 2011) and wing shape (Kralj et al. 2010). Reed warblers dicted 14,645 protein coding genes, with an average thus provide a promising system to study population, pheno- Coding DNA Sequence (CDS) length of 1,782 bp, and an av- typic, and genetic responses to climate change. erage intron length of 2,918 bp (table 1). The annotated genes Although there has been an increasing number of avian had 97.5% completeness (based on predicted proteins). genome assemblies in recent years (e.g., Feng et al. 2020), many nonmodel species, including the reed warbler, are still lacking a genome resource. To date, the closest relative to Synteny Analysis the reed warbler with a published reference genome is the The reed warbler genome showed high synteny with the great great tit (Parus major) (GCA_001522545.3, deposited in tit genome, though with some notable differences (fig. 1). The NCBI; Laine et al. 2016), but the unpublished genome of reed warbler chromosome 6 is a fusion of great tit chromo- the garden warbler (Sylvia borin) is available in public data- somes 7 and 8, and reed warbler chromosome 8 is a fusion of bases (GCA_014839755.1, deposited in NCBI). There is also great tit chromosomes 6 and 9. Interestingly, these chromo- a genome in preprint from the Acrocephalus genus, the somes are not fused in the garden warbler genome (supple- great reed warbler (A. arundinaceus) (Sigeman, Strandh, et mentary fig. 1, Supplementary Material online), but al. 2020), but the scaffolds are not chromosome length. correspond to the great tit chromosomes. This suggests that Here, we present the first genome assembly of the reed the fusions evolved relatively recently, perhaps at the base of warbler, based on PacBio, 10, and Hi-C sequencing, with the Acrocephalidae branch within Sylvioidea, but further re- descriptions of the assembly, manual curation, and annotation. search is needed to determine this. Hi-C contact maps confirm This genome will be a valuable resource for a number of studies, that the chromosomes assembled in the reed warbler genome including studies of coevolution, population genomics, adaptive are unbroken (supplementary fig. 2, Supplementary Material evolution, and comparative genomics. For reduced- online). Interchromosomal rearrangements are rare in avian representation sequencing (e.g., RAD-seq) studies, it will help evolution (Ellegren 2010; Skinner and Griffin 2012), with produce a more robust SNP set than with a de novo approach some exceptions, such as in the orders Falconiformes (Shafer et al. 2017). It will facilitate the detection of selective (Damas et al. 2017) and Psittaciformes (Furo et al. 2018). In sweeps, and provide the physical localization of variants (Manel fact, in all or most species of Psittaciformes, chicken chromo- et al. 2016), thus giving insight into the potential genes involved somes 6 and 7, and 8 and 9 are fused (Furo et al. 2018; in adaptation. Furthermore, the genome will be an important Kretschmer et al. 2018)—the same chromosomes involved resource in the study of chromosomal rearrangements in birds. in the fusions discovered in the reed warbler genome. We can only speculate about the significance of this without Results and Discussion more data. Passeriformes, the sister group of Psittaciformes, exhibit much lower rates of interchromosomal rearrange- Genome Assembly and Genome Quality Evaluation ments, despite being a large, highly diverse order We generated 3,810,665 reads with PacBio, with an average (Kretschmer et al. 2021). There is still a large knowledge gap read length of 16 kb at 61 coverage. We further obtained in the cytogenetics of birds (Degrandi et al. 2020), and more 2 Genome Biol. Evol. 13(9) doi:10.1093/gbe/evab212 Advance Access publication 9 September 2021
Genome Assembly of the Reed Warbler GBE Table 1 Summary Statistics of the Reed Warbler Genome Assembly and Annotation Genome Assembly Estimated genome size 1.13 Gb Guanine and cytosine content 41.91% N50 length (contig) 13 Mb Longest contig 48 Mb Total length of contigs 1.07 Gb N50 length (scaffold) 74.44 Mb Longest scaffold 153.80 Mb Downloaded from https://academic.oup.com/gbe/article/13/9/evab212/6367782 by guest on 18 December 2021 Total length of scaffolds 1.08 Gb Complete BUSCOs 95.7% Transposable Elements Percent (%) Total length DNA 0.22 2.35 Mb LINE 4.11 44.2 Mb SINE 0.09 0.98 Mb LTR 4.50 48.4 Mb Unknown 0.55 5.9 Mb Other (satellites, simple repeats, 1.49 16 Mb low complexity) Total 10.94 117.6 Mb Protein-Coding Genes Predicted genes 14,645 Average coding sequence length 1,782 (bp) Average exon length (bp) 284 Average intron length (bp) 2918 Complete BUSCOs 97.5% research is needed to determine the rarity of the fusions we Sylvioidea, we found unequivocal evidence of two novel discovered in the reed warbler genome. macrochromosomal fusions in the reed warbler genome. We furthermore confirm the previously identified neo-sex Further research is needed to determine the evolutionary chromosome (Pala et al. 2012; Sigeman, Ponnikas, et al. age of these fusions, especially because they are not pre- 2020), a fusion between the ancestral chromosome Z and a sent in the garden warbler genome, suggesting they are part of chromosome 4A (according to chromosome naming relatively new. This genome will serve as an important from the zebra finch). This fusion is thought to have occurred resource to increase our knowledge of chromosomal rear- at the base of the Sylvioidea branch (Pala et al. 2012) and is rangements in birds, both their prevalence and their sig- shared with all species of Sylvioidea studied so far (Sigeman, nificance for avian evolution. Furthermore, the genome Ponnikas, et al. 2020). Figure 1 clearly shows that reed war- will, through the identification of genetic variants and in- bler chromosome Z corresponds to great tit chromosome Z, formation of the function of associated genes, provide a plus a part of great tit chromosome 4A, whereas reed warbler deeper insight into the evolution of the reed warbler, a chromosome Z corresponds to garden warbler chromosome Z bird which will continue to fascinate researchers for years (supplementary fig. 1, Supplementary Material online). to come. Conclusion Materials and Methods In this study, we present the first assembled and anno- tated genome for the reed warbler A. scirpaceus. We have Sampling and Isolation of Genomic DNA accomplished this through utilizing long read PacBio se- Blood was collected from a brachial vein of a female quencing, and scaffolding with paired-end 10 and Hi-C reed warbler (subspecies A. scirpaceus scirpaceus, reads. In addition to the previously identified autosome- NCBI Taxonomy ID: 126889) in Storminnet, Porvoo sex chromosome fusion shared by all members of (60 190 24.900 N, 25 350 23.000 E), Finland, on May 22, 2019. Genome Biol. Evol. 13(9) doi:10.1093/gbe/evab212 Advance Access publication 9 September 2021 3
Sætre et al. GBE Downloaded from https://academic.oup.com/gbe/article/13/9/evab212/6367782 by guest on 18 December 2021 FIG. 1.—Circos plot showing the synteny between the reed warbler (on the right side, denoted with the prefix as [Acrocephalus scirpaceus]) and the great tit (left side, prefix pm [Parus major]) genome assemblies. The reed warbler chromosome 6 is a fusion of great tit chromosomes 7 and 8, whereas reed warbler chromosome 8 is a fusion of great tit chromosomes 6 and 9 (see Hi-C contact maps in supplementary fig. 2, Supplementary Material online). The reed warbler chromosome Z corresponds to great tit chromosome Z, and a part of great tit chromosome 4A. Catching and sampling procedures complied with the Finnish Library Preparation and Sequencing law on animal experiments and permits were licensed by the DNA quality was checked using a combination of a fluoro- National Animal Experiment Board (ESAVI/3920/2018) and metric (Qubit, Invitrogen), UV absorbance (Nanodrop, Southwest Finland Regional Environment Centre (VARELY/ Thermo Fisher) and DNA fragment length assays (HS-50 kb 758/2018). Reed warblers were trapped with a mist net, fragment kit from AATI, now part of Agilent Inc.). The PacBio ringed and handled by E.K. under his ringing license. library was prepared using the Pacific Biosciences Express li- The blood (80 ml) was divided and stored separately in brary preparation protocol. DNA was fragmented to 35 kb. 500 ml ethanol, and in 500 ml SET buffer (0.15M NaCl, Size selection of the final library was performed using 0.05M Tris, 0.001M EDTA, pH 8.0). The samples were imme- BluePippin with a 15 kb cut-off. Six single-molecule real- diately placed in liquid nitrogen, and kept at 80 C when time (SMRT) cells were sequenced using Sequel Polymerase stored. We performed phenolchloroform DNA isolation on v3.0 and Sequencing chemistry v3.0 on a PacBio RS II instru- the sample stored in SET buffer, following a modified protocol ment. The 10 Genomics Chromium linked-read protocol from Sambrook et al. (1989). (10 Genomics Inc.) was used to prepare the 10 library, 4 Genome Biol. Evol. 13(9) doi:10.1093/gbe/evab212 Advance Access publication 9 September 2021
Genome Assembly of the Reed Warbler GBE and due to the reed warbler’s smaller sized genome, only resulting in 521 corrections (breaks, joins and removal of er- 0.7 ng/ml of high molecular weight DNA was used as input. roneously duplicated sequence). HiGlass (Kerpedjiev et al. A high-throughput chromosome conformation capture (Hi-C) 2018) and PretextView (https://github.com/wtsi-hpag/ library was constructed using 50 ml of blood, following step 10 PretextView, last accessed September 17, 2021) were used and onwards in the Arima Hi-C (Arima Genomics) library pro- to visualize and rearrange the genome using Hi-C data, and tocol for whole blood. Adaptor ligation with Unique dual PretextSnapshot (https://github.com/wtsi-hpag/ indexing (Illumina) were chosen to match the indexes from PretextSnapshot, last accessed September 17, 2021) was the 10 linked-read library for simultaneous paired-end se- used to generate an image of the Hi-C contact map. The quencing (150 bp) on the same lane on an Illumina HiSeq X corrections made reduced the total length of scaffolds by platform. Both libraries were quality controlled using a 0.5% and the scaffold count by 44.6%, and increased the Downloaded from https://academic.oup.com/gbe/article/13/9/evab212/6367782 by guest on 18 December 2021 Fragment analyzer NGS kit (AATI) and qPCR with the Kapa scaffold N50 by 20.2%. Curation identified and confirmed 29 library quantification kit (Roche) prior to sequencing. autosomes and the Z and W chromosomes, to which 98.6% The sequencing was provided by the Norwegian of the assembly sequences were assigned. Sequencing Centre (https://www.sequencing.uio.no, last accessed September 17, 2021), a national technology plat- Genome Quality Evaluation form hosted by the University of Oslo and supported by the “Functional Genomics” and “Infrastructure” programs of the We assessed the quality of the assembly with the assembla- Research Council of Norway and the South-Eastern Regional thon_stats.pl script (Bradnam et al. 2013) and investigated the Health Authorities. completeness of the genome with Benchmarking Universal Single-Copy Orthologs (BUSCO) v. 5.0.0 (Sim~ao et al. 2015), Genome Size Estimation and Genome Assembly searching for 8,338 universal avian single-copy orthologs (aves_odb10). The genome size of the reed warbler was estimated by a k- mer analysis of 10 reads using Jellyfish v. 2.3.0 (Marçais and Genome Annotation Kingsford 2011) and Genome Scope v. 1.0 (Vurture et al. 2017), with a k-mer size of 21. The estimated genome size We used a repeat library provided by Alexander Suh called was 1,130,626,830 bp. bird_library_25Oct2020 and described in Peona et al. (2020) We assembled the long-read PacBio sequencing data with to softmask repeats in the reed warbler genome assembly. FALCON and FALCON-Unzip (falcon-kit 1.5.2 and falcon- Softmasked genome assemblies for golden eagle (Aquila unzip 1.3.5) (Chin et al. 2016). Falcon was run with the fol- chrysaetos), chicken (Gallus gallus), great tit (Parus major), lowing parameters: length_cutoff ¼ 1; length_cutoff_pr ¼ Anna’s hummingbird (Calypte anna), zebra finch 1000; pa_HPCdaligner_option ¼ –v –B128 –M24; pa_da- (Taeniopygia guttata), great reed warbler (Acrocephalus arun- ligner_option ¼ –e0.8 –l2000 –k18 –h480 –w8 –s100; dinaceus), icterine warbler (Hippolais icterina), collared fly- ovlp_HPCdaligner_option ¼ –v –B128 –M24; ovlp_daligner_op- catcher (Ficedula albicollis), and New Caledonian crow tion ¼ –k24 –e.94 –l3000 –h1024 –s100; pa_DBsplit_option ¼ (Corvus moneduloides) were downloaded from NCBI. The tri- –x500 –s200; ovlp_DBsplit_option ¼ –x500 –s200; falcon_sen- angle subcommand from Mash v. 2.3 (Ondov et al. 2016) was se_option ¼ –output-multi –min-idt 0.70 –min-cov 3 –max-n- used to estimate a lower-triangular distance matrix, and a read 200; overlap_filtering_setting ¼ –max-diff 100 –max-cov Python script (https://github.com/marbl/Mash/issues/9#issue- 100 –min-cov 2. Falcon-unzip was run with default settings. The comment-509837201, last accessed September 17, 2021) purge_haplotigs pipeline v. 1.1.0 (Roach et al. 2018) was used to was used to convert the distance matrix into a full matrix. curate the diploid assembly, with -l5, -m35, -h190 for the contig The full matrix was used as input to RapidNJ v. 2.3.2 coverage, and -a60 for the purge pipeline. Next, we scaffolded (Simonsen et al. 2008) to create a guide tree based on the the curated assembly with the 10 reads using Scaff10X v. 4.1 neighbor-joining method. Cactus v. 1.3.0 (Armstrong et al. (https://github.com/wtsi-hpag/Scaff10X, last accessed 2020) was run with the guide tree and the softmasked ge- September 17, 2021), and the Hi-C reads using SALSA v. 2.2 nome assemblies as input. (Ghurye et al. 2017). Finally, we polished the assembly (com- We also downloaded the annotation for chicken, and used bined with the alternative assembly from Falcon-Unzip), first it as input to the Comparative Annotation Toolkit (CAT) v. with PacBio reads using pbmm2 v. 1.2.1, which uses minimap2 2.2.1-36-gfc1623d (Fiddes et al. 2018) together with the hi- (Li 2018) internally (v. 2.17), and then with 10 reads for two erarchical alignment format file from Cactus. Chicken was rounds with Long Ranger v. 2.2.2 (Marks et al. 2019) and used as reference genome, reed warbler as the target ge- FreeBayes v. 1.3.1 (Garrison and Marth 2012). nome and the AUGUSTUS (Stanke et al. 2008) species param- eter was set to “chicken.” InterProScan v. 5.34-73 (Jones et Curation al. 2014) was run on the predicted proteins to find functional The assembly was decontaminated and manually curated us- annotations, and DIAMOND v. 2.0.7 (Buchfink et al. 2021) ing the gEVAL browser (Chow et al. 2016; Howe et al. 2021), was used to compare the predicted proteins against Genome Biol. Evol. 13(9) doi:10.1093/gbe/evab212 Advance Access publication 9 September 2021 5
Sætre et al. GBE UniProtKB/Swiss-Prot release 2021_03 (The UniProt and can be accessed at https://doi.org/10.6084/m9.figshare. Consortium 2021). AGAT v. 0.5.1 (Dainat 2021) was used 16622302.v1. to generate statistics from the GFF3 file with annotations and to add functional annotations from InterProScan and Literature Cited gene names from UniProtKB/Swiss-Prot. BUSCO v. 5.0.0 Armstrong J, et al. 2020. Progressive Cactus is a multiple-genome aligner was used to assess the completeness of the annotation. for the thousand-genome era. Nature 587(7833):246–251. Both C, et al. 2010. Avian population consequences of climate change are most severe for long-distance migrants in seasonal habitats. Proc Biol Synteny Analysis Sci. 277(1685):1259–1266. Bradnam KR, et al. 2013. Assemblathon 2: evaluating de novo methods of We aligned the assembly against the great tit (Parus major) Downloaded from https://academic.oup.com/gbe/article/13/9/evab212/6367782 by guest on 18 December 2021 genome assembly in three vertebrate species. Gigascience 2(1):10. and the garden warbler (Sylvia borin) genome assemblies with Brommer JE, Lehikoinen A, Valkama J. 2012. The breeding ranges of minimap2 v. 2.18-r1015 and extracted only alignments lon- central European and arctic bird species move poleward. PLoS One ger than 5,000 bp. The bundlelinks from circos-tools v. 0.23 7(9):e43648. was used to merge neighboring links using default options Buchfink B, Reuter K, Drost HG. 2021. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods and a plot was created using circos v. 0.69-8. 18(4):366–368. Chamorro D, Nieto I, Real R, Mun ~oz AR. 2019. Wintering areas on the move in the face of warmer winters. Ornis Fennica 96:41–54. Supplementary Material Chin CS, et al. 2016. Phased diploid genome assembly with single mole- Supplementary data are available at Genome Biology and cule real-time sequencing. Nat Methods 13(12):1050–1054. Evolution online. Chow W, et al. 2016. gEVAL—a web-based browser for evaluating ge- nome assemblies. Bioinformatics 32(16):2508–2510. Dainat J. 2021. AGAT: Another Gff Analysis Toolkit to Handle Annotations in Any GTF/GFF Format (Version v0.5.1). Zenodo. Available from: Acknowledgments https://www.doi.org/10.5281/zenodo.3552717. Accessed September 17, 2021. We would like to thank Marjo Saastamoinen, Suvi Sallinen, Damas J, et al. 2017. Upgrading short-read animal genome assemblies to Paolo Momigliano, and the Molecular Ecology and chromosome level using comparative genomics and a universal probe Systematics laboratory in the University of Helsinki for facili- set. Genome Res. 27(5):875–884. tating DNA extraction. We would like to thank Ave Tooming- Davies NB, Brooke ML. 1989. An experimental study of co-evolution be- Klunderud and the Norwegian Sequencing Centre for per- tween the cuckoo, Cuculus canorus, and its hosts. I. Host egg discrim- ination. J Anim Ecol. 58(1):207–224. forming the sequencing. We also thank Pasi Rastas from Degrandi TM, et al. 2020. Introducing the bird chromosome database: an the HiLIFE BioData Analytics Service unit, for his assistance overview of cytogenetic studies in birds. Cytogenet Genome Res. with preliminary analyses. The computations were performed 160(4):199–205. on resources provided by UNINETT Sigma2 - the National Eglington SM, et al. 2015. Latitudinal gradients in the productivity of Infrastructure for High Performance Computing and Data European migrant warblers have not shifted northwards during a pe- riod of climate change. Global Ecol Biogeogr. 24(4):427–436. Storage in Norway. This work was supported by Research Ellegren H. 2010. Evolutionary stasis: the stable chromosomes of birds. Council of Norway by grant numbers 251076 and 300032 Trends Ecol Evol. 25(5):283–291. to K.S.J. and a HiLIFE Start-up grant and a University of Feng S, et al. 2020. Dense sampling of bird diversity increases power of Helsinki Faculty of Biological and Environmental Sciences comparative genomics. Nature 587(7833):252–257. travel grant to R.T. Fiddes IT, et al. 2018. Comparative Annotation Toolkit (CAT) – simulta- neous clade and personal genome annotation. Genome Res. 28(7):1029–1038. Author Contributions Furo IDO, et al. 2018. Chromosome painting in Neotropical long- and short-tailed parrots (Aves, Psittaciformes): phylogeny and proposal C.L.C.S., F.E., K.R., K.S.J., O.K.T., and R.T. designed the re- for a putative ancestral karyotype for tribe Arini. Genes 9: 491. search. E.K., K.R., and R.T. collected the sample. K.R. Available from: 10.3390/genes9100491. Accessed September 17, extracted DNA. C.L.C.S. and O.K.T. performed the research 2021. Garrison E, Marth G. 2012. Haplotype-based variant detection from short- and/or analyzed the data. A.T., J.T., K.H., S.P., and W.C. cu- read sequencing. arXiv Preprint. Available from: https://arxiv.org/abs/ rated the assembly. C.L.C.S. drafted the manuscript. All 1207.3907. Accessed September 17, 2021. authors read and approved the final manuscript. Ghurye J, Pop M, Koren S, Bickhart D, Chin CS. 2017. Scaffolding of long read assemblies using long range contact information. BMC Genomics 18(1):527. Data Availability Halupka L, Dyrcz A, Borowiec M. 2008. Climate change affects breeding of reed warblers Acrocephalus scirpaceus. J Avian Biol. 39(1):95–100. The reference genome of Acrocephalus scirpaceus (bAcrSci1), Howe K, et al. 2021. Significantly improving the quality of genome as- and the raw sequence data, have been deposited in the semblies through curation. GigaScience 10(1):giaa153. European Nucleotide Archive under the BioProject number J€ arvinen O, Ulfstrand S. 1980. Species turnover of a continental bird fauna: PRJEB45715. Genome annotations are available at Figshare northern Europe, 1850-1970. Oecologia 46(2):186–195. 6 Genome Biol. Evol. 13(9) doi:10.1093/gbe/evab212 Advance Access publication 9 September 2021
Genome Assembly of the Reed Warbler GBE Jones P, et al. 2014. InterProScan 5: genome-scale protein function clas- Sambrook J, Fritsch EF, Maniatis T. 1989. Molecular cloning: a laboratory sification. Bioinformatics 30(9):1236–1240. manual, 2nd ed. New York: Cold Spring, Harbor Laboratory Press. Kerpedjiev P, et al. 2018. HiGlass: web-based visual exploration and anal- Schaefer T, Ledebur G, Beier J, Leisler B. 2006. Reproductive responses of ysis of genome interaction maps. Genome Biol. 19(1):125. two related coexisting songbird species to environmental changes: Kralj J, Prochazka P, Fainov a D, Patzenhauerov a H, Tutis V. 2010. global warming, competition, and population sizes. J Ornithol. Intraspecific variation in the wing shape and genetic differentiation 147(1):47–56. of reed warblers Acrocephalus scirpaceus in Croatia. Acta Ornithol. Shafer ABA, et al. 2017. Bioinformatic processing of RAD-seq data dra- 45(1):51–58. matically impacts downstream population genetic inference. Methods Kretschmer R, Ferguson-Smith MA, de Oliveira EHC. 2018. Karyotype Ecol Evol. 8(8):907–917. evolution in birds: from conventional staining to chromosome paint- Sigeman H, Strandh, M, et al. 2020. Genomics of an avian neo-sex chro- ing. Genes 9:181. mosome reveals the evolutionary dynamics of recombination suppres- Kretschmer R, et al. 2021. Karyotype evolution and genomic organization sion and sex-linked genes. bioRxiv Preprint, Available from: 10.1101/ Downloaded from https://academic.oup.com/gbe/article/13/9/evab212/6367782 by guest on 18 December 2021 of repetitive DNAs in the saffron finch, Sicalis flaveola (Passeriformes, 2020.09.25.314088. Accessed September 17, 2021. Aves). Animals 11(5):1456. Available from: 10.3390/ani11051456. Sigeman H, Ponnikas S, Hansson B. 2020. Whole-genome analysis across Accessed September 17, 2021. 10 songbird families within Sylvioidea reveals a novel autosome-sex Laine VN, Great Tit HapMap Consortium, et al. 2016. Evolutionary signals chromosome fusion. Biol Lett. 16(4):20200082. of selection on cognition from the great tit genome and methylome. Sim~ao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. Nat Commun. 7:10474. 2015. BUSCO: assessing genome assembly and annotation Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. completeness with single-copy orthologs. Bioinformatics. Bioinformatics 34(18):3094–3100. 31(19):3210–3212. Manel S, et al. 2016. Genomic resources and their influence on the de- Simonsen M, Mailund T, Pedersen CNS. 2008. Rapid neighbour-joining. In: tection of the signal of positive selection in genome scans. Mol Ecol. Crandall KA, Lagergren J, editors. Algorithms in bioinformatics. WABI 25(1):170–184. 2008. Lecture Notes in Computer Science. Vol 5251. Berlin, Marçais G, Kingsford C. 2011. A fast, lock-free approach for efficient Heidelberg (Germany): Springer. p. 113–122. Available from: parallel counting of occurrences of k-mers. Bioinformatics 10.1007/978-3-540-87361-7_10. Accessed September 17, 2021. 27(6):764–770. Skinner BM, Griffin DK. 2012. Intrachromosomal rearrangements in avian Marks P, et al. 2019. Resolving the full spectrum of human genome var- genome evolution: evidence for regions prone to breakpoints. iation using linked-reads. Genome Res. 29(4):635–645. Heredity (Edinb). 108(1):37–41. Meller K, Piha M, V€ah€atalo AV, Lehikoinen A. 2018. A positive relationship Stanke M, Diekhans M, Baertsch R, Haussler D. 2008. Using native and between spring temperature and productivity in 20 songbird species in syntenically mapped cDNA alignments to improve de novo gene find- the boreal zone. Oecologia 186(3):883–893. ing. Bioinformatics 24(5):637–644. Ondov BD, et al. 2016. Mash: fast genome and metagenome distance Stokke B, et al. 2018. Characteristics determining host suitability for a estimation using MinHash. Genome Biol. 17(1):132. generalist parasite. Sci Rep. 8(1):6285. Pala I, et al. 2012. Evidence of a neo-sex chromosome in birds. Heredity Stolt BO. 1999. The Swedish reed warbler Acrocephalus scirpaceus pop- (Edinb). 108(3):264–272. ulation estimated by a capture-recapture technique. Ornis Svecica Peona V, et al. 2020. The avian W chromosome is a refugium for endog- 9:35–46. enous retroviruses with likely effects on female-biased mutational load Sætre CL, et al. 2017. Rapid adaptive phenotypic change following colo- and genetic incompatibilities. Philos Trans R Soc B. 376:20200186. nization of a newly restored habitat. Nat Commun. 8:14159. Prochazka P, et al. 2011. Low genetic differentiation among reed warbler The UniProt Consortium. 2021. UniProt: the universal protein knowledge- Acrocephalus scirpaceus populations across Europe. J Avian Biol. base in 2021. Nucleic Acids Res. 49:D480–D489. 42(2):103–113. Thorogood R, Davies NB. 2013. Reed warbler hosts fine-tune their defenses Roach MJ, Schmidt SA, Borneman AR. 2018. Purge Haplotigs: allelic contig to track three decades of cuckoo decline. Evolution 67(12):3545–3555. reassignment for third-gen diploid genome assemblies. BMC Thorogood R, Spottiswoode CN, Portugal SJ, Gloag R. 2019. The coevo- Bioinformatics 19(1):460. lutionary biology of brood parasitism: a call for integration. Philos Trans Røed T. 1994. “Rørsanger.” In Norsk Fugleatlas, 382–383. Norsk R Soc B. 374:20180190. Ornitologisk Forening. Vickery JA, et al. 2014. The decline of Afro-Palaearctic migrants and an Salewski V, Hochachka WM, Fiedler W. 2010. Global warming and assessment of potential causes. Ibis 156(1):1–22. Bergmann’s rule: do central European passerines adjust their body Vurture GW, et al. 2017. GenomeScope: fast reference-free genome pro- size to rising temperatures? Oecologia 162(1):247–260. filing from short reads. Bioinformatics 33(14):2202–2204. Associate editor: Bonnie Fraser Genome Biol. Evol. 13(9) doi:10.1093/gbe/evab212 Advance Access publication 9 September 2021 7
You can also read