Divergent paralogues of ribosomal DNA in eucalypts (Myrtaceae)
Molecular Phylogenetics and Evolution 44 (2007) 346–356
www.elsevier.com/locate/ympev

Michael J. Bayly *, Pauline Y. Ladiges
School of Botany, The University of Melbourne, Vic., 3010, Australia

Received 10 August 2006; revised 9 October 2006; accepted 23 October 2006
Available online 7 November 2006

* Corresponding author. Fax: +61 3 9347 5460. E-mail address: mbayly@unimelb.edu.au (M.J. Bayly).
1055-7903/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.ympev.2006.10.027

Abstract

The presence of divergent paralogues of nuclear ribosomal DNA, from the 18S–5.8S–26S cistron, is reported in members of Eucalyptus subg. Eucalyptus. These paralogues, which include non-functional pseudogenes, probably diverged prior to the differentiation of species groups in subg. Eucalyptus. When compared with presumably functional sequences, the pseudogenes show greater sequence variation between species, particularly in the 5.8S gene. They are also characterised by reduced GC content, associated with a reduced number of CpG and CpNpG methylation sites, and an increase in the inferred number of methylation-induced substitutions. Some pseudogenes also lack motifs that are usually conserved in plants, both in ITS1 and the 5.8S gene. Two main lineages of pseudogenes are identified, one isolated from a group of western Australian species, one from a group of eastern Australian species. It is not clear whether these two lineages of pseudogenes are orthologous, or represent independent divergences from functional sequence types. The presence of divergent rDNA paralogues highlights the need for caution when interpreting eucalypt phylogenies based on ITS sequences.
© 2006 Elsevier Inc. All rights reserved.

Keywords: Angophora; Australia; Corymbia; Eucalyptus; Internal transcribed spacer (ITS); Methylation; Myrtaceae; nrDNA; Paralogy; Phylogeny; Pseudogenes

1. Introduction

The internal transcribed spacer (ITS) regions of ribosomal DNA (rDNA) are used extensively in phylogenetic studies of flowering plants, especially for analyses of relationships within genera, or among closely related genera. Plant genomes include multiple copies (paralogues) of these rDNA regions, in the order of thousands of copies per cell, arranged in one to several arrays of tandem repeats. Relative homogeneity of sequences across multiple copies of rDNA is maintained through concerted evolution, which involves such processes as gene conversion and unequal crossing over (Arnheim, 1983), and which recent studies have shown to be extremely efficient and rapid in some taxa (Kovarik et al., 2005). Divergence of rDNA copies will, however, occur where concerted evolution is slow, or does not act on some copies, e.g., if copies are dispersed to a different part of the genome (Childs et al., 1981). In recent years, divergent rDNA paralogues within single genomes have been reported from an increasing range of plant and fungal groups (e.g., Buckler et al., 1997; Lieckfeldt and Seifert, 2000; Hartmann et al., 2001; Mayol and Roselló, 2001; Muir et al., 2001; Bailey et al., 2003; Razafimandimbison et al., 2004; Won and Renner, 2005; Álvarez and Wendel, 2003). The presence of such paralogues can, if undetected, confound attempts at phylogenetic reconstruction (Sanderson and Doyle, 1992).

Prominent among divergent rDNA paralogues are non-functional pseudogenes. These rDNA copies, freed from functional constraints, are generally characterised by increased substitution rates in conserved regions and an increase in the number of methylation-induced substitutions, which lead, in turn, to reduced GC content and reduced stability of secondary structure (Buckler et al., 1997; Bailey et al., 2003). Methylation-induced substitutions occur because cytosines, methylated to 5-methylcytosine, frequently mutate to thymine by deamination (Vairapandi and Duker, 1994; Ng and Bird, 1999). DNA methylation in plants and, therefore, these kinds of mutations, occur chiefly at CpG and CpNpG sites (where N is any nucleotide; Gardiner-Garden et al., 1992; Bender, 2003). In contrast to pseudogenes, functional copies of rDNA maintain large numbers of CpG and CpNpG sites, presumably because of functional constraints.

In this paper, we report the occurrence of divergent paralogues of rDNA in Eucalyptus L’Hér. (Myrtaceae), the dominant tree genus over much of the Australian continent. These paralogues, which are probably non-functional pseudogenes, were discovered while trialling different primer combinations for direct sequencing of rDNA for a phylogenetic study of the monocalypt eucalypts, Eucalyptus subg. Eucalyptus, a group of 110 species (sensu Brooker, 2000) distributed in eastern and south-western Australia (Fig. 1). Properties of the divergent paralogues are described. Their phylogenetic histories are considered by comparison with previously published sequences of the eucalypt group (Steane et al., 1999, 2002; Udovicic and Ladiges, 2000), which, in the broad sense, includes seven genera (Allosyncarpia S.T.Blake, Angophora Cav., Arillastrum Pancher ex Baill., Corymbia K.D. Hill and L.A.S. Johnson, Eucalyptopsis C.T. White, Eucalyptus, Stockwellia D.J. Carr, S.G.M. Carr and B. Hyland) from Australia, Timor, New Guinea, New Britain, Sulawesi, Ceram, Mindanao and New Caledonia (Ladiges et al., 2003).

Fig. 1. Distribution of Eucalyptus subg. Eucalyptus. Only one species, E. diversifolia, occurs in both western and eastern Australia, all others are endemic to one region or the other (25 species in the west, 84 in the east, following the classification of Brooker, 2000).

2. Materials and methods

2.1. Isolation, amplification and sequencing of DNA

Leaf samples from eight species of Eucalyptus subg. Eucalyptus were used in this study: E. acies, E. globoidea, E. insularis, E. lacrimans, E. paliformis, E. sepulcralis, E. spectatrix (a taxonomic synonym of E. stricta in the treatment of Brooker, 2000) and E. triflora (see Brooker, 2000; for authorities of taxon names). These were collected at Currency Creek Arboretum, South Australia (Nicolle, 2003), and dried in silica gel; details of collections and voucher specimens are given in Table 1.

Genomic DNA was isolated from dried samples using a Qiagen DNeasy Plant Mini Kit (Qiagen, Germany), according to the manufacturer’s instructions, including additional centrifugation between steps four and five, and a final elution volume of 100 μl. PCR mixture included 200 μM of each dNTP, 1–3 μl of DNA extract, 1.25 U HotStarTaq DNA polymerase with 2.5 μl of its accompanying 10× PCR buffer (Qiagen, Germany; with the final Mg2+ concentration adjusted to 3 mM), and were made up to volume (25 μl) with ultrapure water. One set of reactions included 0.4 μM of each of the primers ITS5 and ITS4 (White et al., 1990) and the following cycling conditions: one hold at 95 °C for 15 min, 35 cycles of 94 °C for 1 min, 48 °C for 1 min, and 72 °C for 1 min, with an additional 5 min at 72 °C at the end of the last cycle. Another set of reactions included 0.6 μM of each of the primers S3 and S5 (Käss and Wink, 1997) and the following cycling conditions: one hold at 95 °C for 15 min, 32 cycles of 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 20 s, with an additional 5 min at 72 °C at the end of the last cycle. These reactions were performed in a Mastercycler Gradient thermal cycler (Eppendorf). PCR products were purified using a QIAquick PCR Purification Kit (Qiagen, Germany). Purified DNA was directly sequenced using a BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, USA). Sequencing reactions were analysed on an ABI 3730xl 96-capillary automated DNA sequencer, at the Australian Genome Research Facility, Brisbane.

2.2. Sequence alignment and comparison

Contiguous sequences were assembled using Sequencher v. 3.0 (Gene Codes Corporation, USA). For initial comparisons, sequences from the eight eucalypt samples were aligned manually using Se-Al Sequence Alignment Editor v. 2.0a11 (Rambaut, 1996); this alignment was straightforward. For subsequent phylogenetic analyses, sequences were added manually to the alignment of Steane et al. (2002). Individual sequences are available from GenBank (accession numbers are shown in Table 1), and alignments are lodged in TreeBASE. Boundaries between ITS1 and both the 18S and 5.8S genes were identified by reference to sequences from other published studies (e.g., Baldwin, 1993; Steane et al., 1999); notwithstanding that there is some inconsistency, for various reasons (Hershkovitz and Lewis, 1996), in the interpretation of boundaries (especially between ITS1/5.8S) among GenBank accessions.

Individual sequences from the eight eucalypt samples were compared with respect to length, GC content, secondary structure stability, and the presence of conserved motifs, all of which have been shown to vary between functional and pseudogenic rDNA (Buckler et al., 1997).
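The GC-content and methylation-site comparisons described above can be illustrated with a short Python sketch. It is not the authors' procedure: the function names are hypothetical, and the counting rule (each cytosine followed by G, or by any base and then G, counts once, on each strand) is an assumption about how CpG and CpNpG sites might be tallied.

```python
# Illustrative sketch only; names and counting rule are assumptions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement; non-ACGT symbols are left unchanged."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / total


def count_methylation_sites(seq: str) -> int:
    """Count cytosines in a CpG or CpNpG context on the given strand."""
    seq = seq.upper()
    sites = 0
    for i, base in enumerate(seq):
        if base != "C":
            continue
        is_cpg = seq[i + 1:i + 2] == "G"
        is_cpnpg = seq[i + 2:i + 3] == "G"
        if is_cpg or is_cpnpg:
            sites += 1
    return sites


def methylation_sites_both_strands(seq: str) -> int:
    """Sum the potential methylation sites on both DNA strands."""
    return count_methylation_sites(seq) + count_methylation_sites(
        reverse_complement(seq)
    )


# Toy sequence, not a eucalypt sequence.
print(gc_content("ACGTCAGG"))                      # 62.5
print(methylation_sites_both_strands("ACGTCAGG"))  # 4
```

Applied to a "typical" and a "divergent" sequence from the same sample, lower values from both functions for the "divergent" copy would be in line with the pseudogene pattern described by Buckler et al. (1997).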
Table 1
Sources of plant material and DNA sequences
For newly sequenced samples, GenBank numbers are listed as “typical” followed by “divergent” sequences.

Species                               GenBank number(s)     Locality or reference
Eucalyptus
E. acies                              EF051489, EF051490    CCA 42/25, ex Mermaid Point, WA
E. amygdalina 1 [a]                   AF058494              Steane et al. (1999)
E. amygdalina 2                       AF058496              Steane et al. (1999)
E. balladoniensis                     AF390504              Steane et al. (2002)
E. brachyandra                        AF390517              Steane et al. (2002)
E. brevistylis                        AF390527              Steane et al. (2002)
E. camaldulensis                      AF058473              Steane et al. (1999)
E. cloeziana 1                        AF058462              Steane et al. (1999)
E. coccifera 1 [a]                    AF058502              Steane et al. (1999)
E. coccifera 2                        AF058501              Steane et al. (1999)
E. coccifera 3 [a]                    AF058504              Steane et al. (1999)
E. croajingolensis [a]                AF058497              Steane et al. (1999)
E. curtisii 1                         AF390524              Steane et al. (2002)
E. curtisii 2                         AF390525              Steane et al. (2002)
E. deglupta 1                         AF390518              Steane et al. (2002)
E. delegatensis [b]                   AF058480              Steane et al. (1999)
E. diversicolor                       AF390493              Steane et al. (2002)
E. dives                              AF058503              Steane et al. (1999)
E. elata [a]                          AF058486              Steane et al. (1999)
E. erythrocorys                       AF058458              Steane et al. (1999)
E. eudesmoides                        AF390468              Steane et al. (2002)
E. gamophylla                         AF390469              Steane et al. (2002)
E. globoidea                          EF051491, EF051492    CCA 146/19, ex Stratford, Vic
E. grandis 1                          AF058475              Steane et al. (1999)
E. guilfoylei                         AF390511              Steane et al. (2002)
E. hallii                             AF390512              Steane et al. (2002)
E. insularis                          EF051493, EF051494    CCA 134/28, Cape Le Grand, WA
E. jacksonii [c]                      AF390529              Steane et al. (2002)
E. lacrimans                          EF051495, EF051496    CCA 59/10, ex Adaminaby, NSW
E. lansdowneana                       AF058476              Steane et al. (1999)
E. latisinensis                       AF390532              Steane et al. (2002)
E. leucophloia                        AF390470              Steane et al. (2002)
E. marginata [c]                      AF390530              Steane et al. (2002)
E. megacarpa                          AF390528              Steane et al. (2002)
E. microcorys                         AF390516              Steane et al. (2002)
E. nitens 2                           AF058472              Steane et al. (1999)
E. nitida [b]                         AF058481              Steane et al. (1999)
E. obliqua                            AF058484              Steane et al. (1999)
E. pachyphylla                        AF390473              Steane et al. (2002)
E. paliformis                         EF051497, EF051498    CCA 56/6, ex Wadbilliga Trig, NSW
E. pauciflora                         AF058489              Steane et al. (1999)
E. pilularis                          AF390533              Steane et al. (2002)
E. piperita [a]                       AF058485              Steane et al. (1999)
E. pulchella 1                        AF058487              Steane et al. (1999)
E. pulchella 2 [a]                    AF058490              Steane et al. (1999)
E. radiata                            AF058482              Steane et al. (1999)
E. regnans                            AF058488              Steane et al. (1999)
E. risdonii [a]                       AF058493              Steane et al. (1999)
E. rubiginosa                         AF390526              Steane et al. (2002)
E. sieberi                            AF058495              Steane et al. (1999)
E. sepulcralis                        EF051499, EF051500    CCA 62/35, ex Eyre Range, WA
E. spectatrix                         EF051501, EF051502    CCA 141/19, ex Bega, NSW
E. staeri [c]                         AF390531              Steane et al. (2002)
E. tenuipes                           AF390523              Steane et al. (2002)
E. tenuiramis 1 [a]                   AF058500              Steane et al. (1999)
E. tenuiramis 2 [a]                   AF058491              Steane et al. (1999)
E. tindaliae [a]                      AF390534              Steane et al. (2002)
E. triflora                           EF051503, EF051504    CCA 34/14, ex Nerriga NSW
E. umbra                              AF058505              Steane et al. (1999)
E. willisii subsp. falciformis [a]    AF058498              Steane et al. (1999)
E. willisii subsp. willisii [a]       AF058499              Steane et al. (1999)
Corymbia
C. eximia                             AF390464              Steane et al. (2002)
C. haematoxylon                       AF390456              Steane et al. (2002)
C. henryi                             AF390457              Steane et al. (2002)
C. maculata                           AF058461              Steane et al. (1999)
C. tessellaris                        AF058457              Steane et al. (1999)
C. trachyphloia                       AF390455              Steane et al. (2002)
Angophora
A. bakeri                             AF058456              Steane et al. (1999)
A. costata                            AF058455              Steane et al. (1999)
A. melanoxylon                        AF390450              Steane et al. (2002)
Eucalyptopsis group
Allosyncarpia ternata                 AF190353              Udovicic and Ladiges (2000)
Eucalyptopsis papuana                 AF190354              Udovicic and Ladiges (2000)
Stockwellia quadrifida                AF058452              Steane et al. (1999)
Outgroup
Arillastrum gummiferum                AF058454              Steane et al. (1999)

Voucher specimens for new sequences are held in the herbarium of The University of Melbourne (MELU). Abbreviations are as follows: CCA, Currency Creek Arboretum (with numbers indicating the row/plant numbers of trees in the arboretum; Nicolle, 2003); NSW, New South Wales; Vic., Victoria; WA, Western Australia. Authorities for species names are given by Chippendale (1988; Angophora), Hill and Johnson (1995; Corymbia), and Brooker (2000; Eucalyptus).
[a] Combined, for the purpose of analysis, into a single terminal taxon, EUC1.
[b] Combined, for the purpose of analysis, into a single terminal taxon, EUC2.
[c] Combined, for the purpose of analysis, into a single terminal taxon, EUC3.

Minimum energy secondary structures were estimated, at 37 °C, for RNA transcripts of ITS1 using mfold version 3.2 (Zuker, 2003; available at http://www.bioinfo.rpi.edu/applications/mfold/), and the associated minimum free energy values (ΔG) were used for comparing structural stabilities of sequences. Sequences were examined for the presence of three motifs previously identified as highly conserved in flowering plant rDNA. These were: 5′-CAAGGAA in ITS1 (Liu and Schardl, 1994); 5′-GAATTGCAGAATC in the 5.8S gene (Jobes and Thien, 1997); an EcoRV restriction site (GATATC) near the 5′-end of the 5.8S gene (Liston et al., 1996).
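A check for the three conserved motifs listed above could be sketched as follows. This is a minimal illustration, not the authors' workflow: the dictionary keys, the function name and the toy sequence are hypothetical, and the search is a plain exact-match scan over the ungapped sequence rather than a test of where in ITS1 or 5.8S the motif falls.

```python
# Illustrative sketch only; names and the toy sequence are assumptions.
CONSERVED_MOTIFS = {
    "ITS1 motif (Liu and Schardl, 1994)": "CAAGGAA",
    "5.8S motif (Jobes and Thien, 1997)": "GAATTGCAGAATC",
    "5.8S EcoRV site (Liston et al., 1996)": "GATATC",
}


def find_motifs(seq: str) -> dict:
    """Map each motif label to its first 0-based position, or -1 if absent.

    Alignment gaps ("-") are removed before searching.
    """
    ungapped = seq.upper().replace("-", "")
    return {label: ungapped.find(motif)
            for label, motif in CONSERVED_MOTIFS.items()}


# Toy sequence assembled from the three motifs plus filler bases.
toy = "TT" + "CAAGGAA" + "TT" + "GATATC" + "GG" + "GAATTGCAGAATC" + "CC"
for label, position in find_motifs(toy).items():
    print(label, position)

# A single substitution inside a motif makes the exact match fail.
mutated = toy.replace("GAATTGCAGAATC", "GAATTGCAAAATC")
print(find_motifs(mutated)["5.8S motif (Jobes and Thien, 1997)"])  # -1
```

Exact matching is deliberately strict here: any substitution in a motif is reported as absence, which is the signal of interest when screening for putative pseudogenes.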
After initial comparisons, sequences from the eight eucalypt samples were grouped into three classes. Further comparisons were made between these sequence classes in terms of the percentage of nucleotide variation (in both ITS1 and the 5.8S gene) and the proportions of potentially methylation-induced substitutions.

2.3. Phylogenetic analyses

Sequence variation among paralogous rDNA sequences from the eight eucalypt samples was assessed using trees based on maximum parsimony analysis. The aligned dataset for this analysis included ITS1 (245 bases) and part of the 5.8S gene (113 bases). Presence/absence of a two base gap, shared by five sequences, was scored as a separate binary character; the gap characters in the alignment were treated as missing data (there being variation among other sequences at one of the two sites). Analyses were conducted using PAUP* 4.0b10 (Swofford, 2000). Starting trees were obtained by a stepwise addition sequence using the CLOSEST option (retaining one tree at each step), and then subjected to TBR branch swapping, with the MULPARS option on. A strict consensus was calculated from the set of equally most parsimonious trees. Branch lengths and character state changes were calculated for one of the equally parsimonious trees using DELTRAN character state optimisation. Consistency indices (CI) and retention indices (RI) were calculated with autapomorphic characters excluded.

The phylogenetic relationships of rDNA sequences from the eight eucalypt samples were analysed in conjunction with a subset of sequences from previous studies (Table 1), representing all eucalypt genera and subgenera (sensu Brooker, 2000). The aligned dataset spanned ITS1, the 5.8S gene and ITS2; sequences for the eight putative pseudogenes covered only part of this region, lacking the 3′-end of 5.8S and all of ITS2 (which were scored as missing data). Some taxa from previous studies with identical sequences (or sequences differing only by autapomorphies) are represented in the analyses by single terminal taxa, i.e., EUC1, EUC2 and EUC3 (see Table 1 and caption to Fig. 5 for details). Excluded from analysis were the six regions of ambiguous alignment previously identified by Steane et al. (2002). Multi-base indels were coded as separate binary characters, and single base gaps were treated as a fifth character state. At some alignment positions nucleotide variation overlapped with multi-base indels; in these cases the character data for the positions were also included in analyses (in addition to a binary character for the indel), but with gaps scored as missing data. Parsimony analysis used the methods outlined in the previous paragraph. In addition, support for nodes was tested by bootstrap analysis, using 1,000 HEURISTIC replicates (with MAXTREES set at 2000, trees built using the CLOSEST option for stepwise addition, and TBR branch swapping). Trees were rooted on the branch connecting Arillastrum to the rest of the tree (Steane et al., 2002; Parra-O et al., 2006).

3. Results

3.1. PCR results and paralogue characteristics

The primer pairs ITS5/ITS4 and S3/S5 each yielded single PCR products, seen as single bands on agarose gels, and direct sequencing of these products was unproblematic. Products amplified with ITS5/ITS4 spanned the region ITS1–5.8S–ITS2; those amplified with S3/S5 included only ITS1 and part of the 5.8S gene (358 aligned bases in total).

Based on patterns of sequence similarity in ITS1 and the 5.8S gene (Fig. 2), three classes of sequence were recognised among the eight eucalypt samples, i.e., “typical,” Western Australian “divergent,” and eastern Australian “divergent”. “Typical” sequences were amplified using the primer pair ITS5/ITS4, and most closely match other sequences in GenBank from Eucalyptus subg. Eucalyptus. “Divergent” sequences, amplified using the primer pair S3/S5, have more unusual sequences, and fall into clear geographic groups that contain taxa from either eastern or western Australia. “Divergent” sequences were selectively amplified using primers S3/S5, despite the fact that “typical” sequences also include the priming sites for these oligonucleotides. It is not known if the ITS5 and ITS4 priming sites are present in “divergent” sequences, since these sites are external to the recovered sequences.

The distribution of sequence variation within and among the three paralogue classes is illustrated in Fig. 3. Despite substantial variation in sequence composition between the three classes, there was little variation in length; the only indel among the aligned sequences being a two base gap shared by all members of the eastern Australian “divergent” group. The very high CI (0.93) for trees comparing sequence classes (Fig. 2), indicates that there is little homoplasy in the data and, therefore, no evidence of recent recombination between the different paralogues.

Fig. 2. Tree showing patterns of sequence divergence among paralogous rDNA sequences from eight eucalypt samples (based on an alignment of 358 bases, including ITS1 and part of the 5.8S gene). This is one of eight equally short trees (77 steps; CI 0.93; RI 0.98) derived from parsimony analysis; the circled branch was not present on the strict consensus. Branch lengths are shown, with values in brackets indicating the number of inferred C → T substitutions on both DNA strands, followed by the number of substitutions that are potentially methylation-induced (at CpG and CpNpG sites), using DELTRAN character optimisation, with the tree rooted on the branch connecting “typical” sequences to the rest of the tree. Character state changes across the root of the tree are ambiguous; for this branch, numbers in brackets indicate the maximum number of C → T and methylation-induced substitutions that could be inferred, with directions indicated by arrows.

Fig. 3. Comparison of sequence variation among the three paralogue classes identified from the eight taxa in this study. Variation among “typical” sequences is restricted to the ITS1, with no variation in the 5.8S gene. The regions of ambiguous alignment in ITS1 are those previously identified by Steane et al. (2002); they are ambiguous only when other eucalypt groups are included (not among these paralogues from subg. Eucalyptus).

When compared with “typical” sequences, the “divergent” types, both individually and collectively (Fig. 4), have a lower G/C content and lower secondary structure stabilities, as indicated by estimated free energy values. Concomitant with this difference in G/C content, are differences in both the number of standard (CpG and CpNpG) methylation sites in divergent sequences (Fig. 4B), and the inferred number of C → T substitutions (on both strands) that are potentially methylation-induced (Fig. 2; Table 2). Differences between “typical” and “divergent” sequences that do not potentially relate to C → T substitutions are few, accounting for 10.4% of inferred character state changes in Fig. 2.

Sequence variation across the ITS1–5.8S region, indicated by the percentage of sites that vary between taxa, is greater among “divergent” than “typical” sequence types, i.e., 3.4% of sites vary between the three taxa of the western “divergent” group, 3.7% between five taxa of the eastern “divergent” group, and 2.2% between the eight “typical” sequences. Among the “typical” sequences, variation is confined to the ITS regions (a presumably non-random pattern of variation). In comparison, “divergent” sequences show variation in both ITS1 and the 5.8S coding region (Table 2, Fig. 3), in proportions that would be expected by chance, if variation in coding and non-coding sites was equally likely.
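The "percentage of sites that vary between taxa" used above can be computed from an alignment along the following lines. This is a hedged sketch rather than the authors' method: the function name is hypothetical, and treating gap or ambiguity symbols as missing data is an assumption chosen to mirror the treatment of gap characters described in Section 2.3.

```python
# Illustrative sketch only; the name and missing-data rule are assumptions.
def percent_variable_sites(aligned, missing="-?N"):
    """Percentage of alignment columns showing more than one base.

    `aligned` is a list of equal-length aligned sequences. Symbols listed in
    `missing` are ignored when deciding whether a column varies.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    variable = 0
    for column in zip(*aligned):
        states = {base.upper() for base in column} - set(missing)
        if len(states) > 1:
            variable += 1
    return 100.0 * variable / length


# Toy alignment, not eucalypt data: columns 3 and 8 vary.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
print(percent_variable_sites(toy_alignment))  # 25.0
```

Running the function separately on the ITS1 columns and the 5.8S columns of each paralogue class would give the kind of per-region percentages reported in Table 2.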
this study to show a mutation in any of these conserved regions. They have a lower GC content (Fig. 4) and, associated with this, fewer methylation sites and lower secondary structure stabilities (free energy values) when compared with both “typical” and eastern Australian “divergent” sequences. The long branch (27 steps) supporting the western divergent group in Fig. 2 also highlights the substantial sequence differences between this group and the members of the eastern Australian “divergent” group.

3.2. Phylogenetic relationships

Analysis of 16 sequences from this study, together with a subset of 66 sequences from previous studies (Steane et al., 1999, 2002; Udovicic and Ladiges, 2000), produced 24,522 equally parsimonious trees (CI 0.54, RI 0.84), one of which is shown in Fig. 5. The broad relationships of Eucalyptus and other eucalypt genera, Corymbia, Angophora, Arillastrum, Eucalyptopsis, Stockwellia and Allosyncarpia, do not differ from those recovered in other studies based on ITS sequences, and they are not discussed here in detail.

In this analysis, the “typical” sequences are placed with sequences of Eucalyptus subg. Eucalyptus from previous studies (Fig. 5). “Typical” sequences from Western Australian species (E. acies, E. insularis and E. sepulcralis) fall within the previously established grade of western taxa, while those from eastern Australia (E. globoidea, E. lacrimans, E. paliformis, E. spectatrix, E. triflora) fall within a clade of eastern taxa. Bootstrap support for the monophyly of subg. Eucalyptus is weak (50%), but there is slightly stronger support (65%) for the monophyly of the eastern clade, as also found in previous studies.

The position of “divergent” sequences in this analysis is determined on the basis of partial sequence data, i.e., including only ITS1 and a partial 5.8S sequence (57% of the total alignment length). The sequences are, however, clearly placed in the Eucalyptus clade with strong (96%) bootstrap support. The western “divergent” and eastern “divergent” sequences each form groups with strong bootstrap support (100 and 99%, respectively). These groups are shown, together, as monophyletic (Fig. 5, node B), but with
Table 2
Properties of rDNA paralogues

Paralogue class         % Variable sites          Conserved     Methylation-induced substitutions [a]
                        ITS1    5.8S    p [b]     sites [c]     % of C → T substitutions    % of total substitutions
“typical”               3.3     0       0.048     Yes           28.6                        25.0
W. Aus. “divergent”     3.3     3.5     0.233     No            90.0                        75.0
E. Aus. “divergent”     3.7     3.5     0.24      Yes/no [d]    91.7                        84.6

[a] Based on character state transformation on Fig. 2.
[b] Binomial probability, given the observed number of variable sites, that they would be distributed between ITS1 and the 5.8S gene in the observed proportions, if variation at all sites is equally likely.
[c] These include three motifs that are highly conserved in plants: CAAGGAA in ITS1 (Liu and Schardl, 1994); GAATTGCAGAATC in 5.8S (Jobes and Thien, 1997); an EcoRV site (GATATC) in 5.8S (Liston et al., 1996).
[d] The conserved motifs are almost always present in this group. The one exception is that E. globoidea includes a mutation in the GAATTGCAGAATC motif in 5.8S.

Fig. 5. One of the 24,522 most parsimonious trees (CI = 0.54, RI = 0.84) from an analysis combining rDNA paralogues from this study with representative sequences from other eucalypts (Steane et al., 1999, 2002; Udovicic and Ladiges, 2000). Circled branches were not present on the strict consensus. Branch lengths are proportional to the number of inferred character state changes; bootstrap values are shown where >50%. New sequences, from this study, are indicated by asterisks (*). Some samples with identical sequences (or sequences differing only by autapomorphies) are represented by single terminal taxa as follows: EUC1 = E. amygdalina 1, E. coccifera 1, E. coccifera 3, E. croajingolensis, E. elata, E. piperita, E. tindaliae, E. pulchella 2, E. risdonii, E. tenuiramis 1, E. tenuiramis 2, E. willisii subsp. falciformis, E. willisii subsp. willisii; EUC2 = E. delegatensis, E. nitida; EUC3 = E. jacksonii, E. staeri, E. marginata. The nodes labelled D and E, also recovered in previous ITS studies (Steane et al., 2002), are not supported by analyses of morphological or ETS data, and are not considered to accurately reflect the phylogenetic relationships of taxa (Hill and Johnson, 1995; Bohte and Drinnan, 2005; Parra-O et al., 2006).
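The binomial probability described in the Table 2 footnote on p can be sketched in Python. The site counts used here are not stated in the source: eight variable ITS1 sites and none in 5.8S for the "typical" class are inferred from the reported percentages (3.3% of 245 ITS1 bases, 0% of 113 5.8S bases), and a point probability is assumed. The function and variable names are hypothetical.

```python
# Illustrative sketch only; site counts are inferred, not taken from the source.
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


ITS1_LENGTH = 245   # aligned ITS1 bases (Section 2.3)
S58_LENGTH = 113    # aligned 5.8S bases (Section 2.3)

# Chance that a randomly placed variable site falls in ITS1 when every
# aligned position is equally likely to vary.
p_its1 = ITS1_LENGTH / (ITS1_LENGTH + S58_LENGTH)

# "typical" class: assumed 8 variable sites, all of them in ITS1.
p_typical = binomial_pmf(8, 8, p_its1)
print(round(p_typical, 3))  # 0.048
```

Under these assumptions the result agrees with the 0.048 listed for "typical" sequences. The same function can be applied to the "divergent" classes once their variable-site counts are known, although the exact counting behind the published values is not given in the text.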
M.J. Bayly, P.Y. Ladiges / Molecular Phylogenetics and Evolution 44 (2007) 346–356 353 Sequence diversiWcation among the “divergent” types that these primers amplify only “divergent” sequences is dominated by substitutions that are potentially methyl- across a range of taxa, it seems likely that PCR selection ation-induced (as discussed above), and knowledge of this (Wagner et al., 1994), rather than PCR drift, is responsible attribute was used to assess the chance that the grouping for this ampliWcation pattern. Consistent with this notion is of the two “divergent” sequence types (at node B, Fig. 5) the observation, from a number of trial PCRs, that ampliW- is artifactual. A hypothetical ancestral sequence (esti- cation of “divergent” sequences is repeatable using primers mated by optimising character states on the parsimony S3/S5 and the given reaction conditions. Selective ampliW- tree, using DELTRAN) for the trichotomous node on cation of putative pseudogenes over functional rDNA which the divergent sequences sit (node A, Fig. 5), sequences was also reported by Buckler et al. (1997), whose includes 61 potential CpG and CpNpG methylation sites experiments suggested it related to diVerences in secondary (from both strands combined), at which 21 methylation- structure between the two sequence types. When compared induced substitutions are inferred on branches leading to with pseudogene sequences, functional rDNA has a higher the Western Australian “divergent” group, and four on GC content (e.g., Fig. 4A). This creates a greater potential branches leading to the eastern Australian “divergent” for within-strand complementarity, which can result in the group. 
Assuming such substitutions happen progressively with equal chance at any suitable site, the probability (p) of a given number of shared methylation-induced substitutions was estimated using the hypergeometric distribution (i.e., assuming an initial pool of 61 sites and two independent samples of 21 and four sites, respectively). For two shared substitutions (two steps being the length of the branch uniting the "divergent" groups) p = 0.31. This number of shared substitutions could, therefore, be reasonably expected by chance, making it possible that the grouping of eastern "divergent" with western "divergent" sequences is artifactual.

The phylogenetic analysis presented here does not, however, include all of the character states shared by the "divergent" groups, relative to "typical" sequences. This is because alignment of ITS sequences across the eucalypts (although not within subg. Eucalyptus) is ambiguous, and some alignment positions were excluded from analysis. In ITS1 there are three regions of ambiguous alignment (Fig. 3), in a particularly GC rich part of the sequence, in which the western and eastern "divergent" groups share four consistent differences relative to "typical" sequences. Whether these shared states are potentially synapomorphic for a monophyletic "divergent" group cannot be determined without phylogenetic analysis. If, however, a tree topology similar to that in Fig. 5 was assumed, two of the four shared characters (both C → T transitions, but not at CpG, CpNpG sites) would be unique to "divergent" sequences relative to subg. Eucalyptus and E. tenuipes, and could potentially add to the branch length below node B.

4. Discussion

4.1. PCR amplification of paralogues

PCR amplification of different paralogues by different primer pairs in this study was found by chance, and the reasons for this phenomenon have not been investigated. It is not known if the ITS4 and ITS5 primer sites are present in "divergent" sequences, since these sites are external to the recovered sequences. In the case of primers S3 and S5, however, sequence comparison shows that the primer sites are present in both "typical" and "divergent" sequences. Given […] formation of hairpins and other secondary structures in single-stranded DNA that can interfere with primer binding and extension during PCR (Baldwin et al., 1995; Hershkovitz et al., 1999), making functional rDNA more difficult to amplify.

4.2. Functionality of "divergent" sequences

Several lines of evidence suggest that "divergent" sequences have differentiated in the absence of functional constraints, and are likely to represent non-functional rDNA pseudogenes. First, they show similar levels of variation in both the spacer (ITS1) and 5.8S coding region (Table 2, Fig. 3); in functional rDNA, variation is expected to be higher in the spacers than coding regions (Hershkovitz et al., 1999). Second, they show elevated levels of variation between species when compared with "typical" sequences (Table 2, Fig. 3); this indicates a higher substitution rate in "divergent" sequences, and is consistent with the relaxation of functional constraints. Third, substantial numbers of methylation-induced substitutions can be inferred (Fig. 2), the result being a reduction in the number of CpG and CpNpG methylation sites (relative to typical sequences), lower GC content, and lower secondary structure stability (Fig. 4).

In addition, the Western Australian "divergent" sequences show mutations in motifs, in both ITS1 and the 5.8S gene, that are widely conserved among plants. The high level of conservation of these motifs (Liu and Schardl, 1994; Liston et al., 1996; Jobes and Thien, 1997) is presumably related to functional constraints. This is likely to be true, not only for the 5.8S gene, but also in ITS1, which, post-transcription, is involved in the complex process of ribosome synthesis. Liu and Schardl (1994), for instance, speculated that the structure of ITS1 plays a role in presenting sequence motifs (potentially including the conserved AAGGAA motif) that are recognised by enzymes involved in ribosome processing.

4.3. Relationships of paralogues

Although some relationships are not clearly established, Fig. 5 suggests that "divergent" pseudogenes differentiated
within the Eucalyptus clade, i.e., after the separation of that clade from the lineage including Corymbia and Angophora, but prior to differentiation of species groups in subg. Eucalyptus. That the paralogues diverged within Eucalyptus is supported by the high bootstrap value (96%) for node C on Fig. 5. That they are early divergences within Eucalyptus is supported by their placement on the tree, and by the fact that highly similar pseudogene sequences (united with strong bootstrap support) are shared among species, suggesting that pseudogenes in these species were inherited from ancestral taxa. At the very least, the western group of pseudogenes was established prior to the differentiation of E. acies, E. insularis, and E. sepulcralis; and the eastern group was established before differentiation of E. globoidea, E. lacrimans, E. paliformis, E. spectatrix and E. triflora.

The relationship of the eastern and western Australian groups of "divergent" sequences to each other is not clearly established. This is because bootstrap support for their relationship is low ( […]

[…] implications for studies of eucalypt phylogeny based on ITS sequences. The paralogues reported in this study are highly differentiated, with the divergent pseudogenes being conspicuous (in terms of the number and pattern of implied substitutions) when compared with presumably functional sequences from closely related taxa. However, if differences between rDNA paralogues were more subtle, or if paralogues were compared among groups with sparse taxon sampling, their presence could be more difficult to detect. Preferential amplification and sequencing of different paralogues in different lineages would also be more likely to go undetected.

ITS sequences have been used extensively in phylogenetic analyses of eucalypts (Steane et al., 1999, 2002; Udovicic and Ladiges, 2000), with more than 170 sequences now available in GenBank. They have also been used in the molecular dating study of Crisp et al. (2004). The possibility that different rDNA paralogues have been […]
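The chance-sharing estimate given earlier for methylation-induced substitutions (a pool of 61 sites, with two independent samples of 21 and four sites) can be checked with a short calculation. The sketch below is illustrative only and is not the authors' code: the function and variable names are hypothetical, and it assumes that the reported p = 0.31 is the probability of exactly two shared sites, an interpretation that reproduces the published value. The probability of two or more shared sites is printed alongside for comparison.

```python
from math import comb


def hypergeom_pmf(k, pool, marked, draws):
    """Probability that exactly k of `draws` sites fall among `marked` sites,
    when sites are drawn without replacement from `pool` equally likely sites."""
    return comb(marked, k) * comb(pool - marked, draws - k) / comb(pool, draws)


# Values taken from the text; which "divergent" group corresponds to which
# sample size is not stated here, and the result is symmetric in the two.
POOL = 61      # sites available for methylation-induced substitution
SAMPLE_A = 21  # substitutions inferred in one "divergent" group
SAMPLE_B = 4   # substitutions inferred in the other "divergent" group

# Assumption: p = 0.31 in the text refers to exactly two shared substitutions.
p_exactly_two = hypergeom_pmf(2, POOL, SAMPLE_A, SAMPLE_B)

# For comparison: two or more shared substitutions.
p_at_least_two = sum(
    hypergeom_pmf(k, POOL, SAMPLE_A, SAMPLE_B) for k in range(2, SAMPLE_B + 1)
)

print(round(p_exactly_two, 2))   # 0.31
print(round(p_at_least_two, 2))  # 0.43
```

Under either reading, sharing two substitutions is far from improbable, which is consistent with the caution expressed about the grouping of eastern and western "divergent" sequences.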
(Steane et al., 1999, 2002), and additional sources of data are required for clarification of relationships. ETS sequences show promise for supplementing ITS data (Parra-O et al., 2006), and chloroplast sequences can also be useful (Whittock et al., 2003), although the comparison of a number of chloroplast regions showed these to be less variable in eucalypts than ITS (Udovicic and Ladiges, 2000). Studies of Tasmanian members of subg. Eucalyptus have also highlighted complex patterns of chloroplast relationship among taxa that might confound phylogeny reconstruction using these uni-parentally inherited markers (Mckinnon et al., 1999). Low copy nuclear genes have not been widely explored for use in phylogenetic analyses of eucalypts, but use of the single copy cinnamoyl CoA reductase gene (Poke et al., 2003), involved in lignin synthesis, has been shown to be problematic because of historical recombination events (Poke et al., 2006).

The possibility that rDNA pseudogenes could provide variation useful for phylogeny reconstruction is worthy of investigation. This would, of course, require clear identification of orthologous/paralogous copies, so that comparisons were based only on orthologues, as well as identification of suitable outgroup sequences. The use of pseudogenes is appealing primarily because they show increased variation among taxa when compared with previously used sequence data. The patterns of relationship among pseudogenes in this study, particularly within the Western Australian "divergent" group, are also consistent with patterns of relationship inferred by both the "typical" sequences and by previous morphological phylogenetic analyses (Ladiges et al., 1987). Degenerate rDNA copies might be expected to independently accumulate similar mutations (particularly methylation-induced substitutions), increasing the possibility of potentially confounding homoplasy. Homoplasy, however, is a virtually ubiquitous feature of all phylogenetic datasets and non-random events characterise the evolution of a range of sequence types, e.g., selective constraints on functional rDNA can bias the occurrence of particular mutations (Hershkovitz et al., 1999), also potentially resulting in homoplasy. The use of rDNA pseudogenes would need to be investigated with due caution, but the chance of turning a potential hindrance to phylogenetic analysis (the presence of divergent paralogues) to some advantage is an interesting prospect considering the relatively low levels of sequence divergence so far discovered in Eucalyptus using common phylogenetic markers.

Acknowledgments

Dean Nicolle provided access to specimens at the Currency Creek Arboretum, and Emma Lewis and Carlos Parra assisted with collections made there. We thank Dan Murphy, Carlos Parra, Alison Kellow, Gareth Holmes and Ed Newbigin for helpful discussions and comments on this manuscript. This work was funded by an Australian Research Council Linkage (ARCL) grant, including financial support from the Maud Gibson Trust, RBG Melbourne and RBG Sydney.

References

Álvarez, I., Wendel, J.F., 2003. Ribosomal ITS sequences and plant phylogenetic inference. Mol. Phylogenet. Evol. 29, 417–434.
Arnheim, N., 1983. Concerted evolution in multigene families. In: Nei, M., Koehn, R. (Eds.), Evolution of Genes and Proteins. Sinauer, Sunderland, MA, pp. 38–61.
Bailey, C.D., Carr, T.G., Harris, S.A., Hughes, C.E., 2003. Characterization of angiosperm nrDNA polymorphism, paralogy, and pseudogenes. Mol. Phylogenet. Evol. 29, 435–455.
Baldwin, B.G., 1993. Molecular phylogenetics of Calycadenia (Compositae) based on ITS sequences of nuclear ribosomal DNA: chromosomal and morphological evolution reexamined. Am. J. Bot. 80, 222–238.
Baldwin, B.G., Sanderson, M.J., Porter, J.M., Wojciechowski, M.F., Campbell, C.S., Donoghue, M.J., 1995. The ITS region of nuclear ribosomal DNA: a valuable source of evidence on angiosperm phylogeny. Ann. Mo. Bot. Garden 82, 247–277.
Bender, J., 2003. DNA methylation and epigenetics. Ann. Rev. Plant Biol. 55, 41–68.
Bohte, A., Drinnan, A., 2005. Floral development and systematic position of Arillastrum, Allosyncarpia, Stockwellia and Eucalyptopsis (Myrtaceae). Plant Syst. Evol. 251, 53–70.
Brooker, M.I.H., 2000. A new classification of genus Eucalyptus L'Hér (Myrtaceae). Aust. Syst. Bot. 13, 79–148.
Buckler, E.S., Ippolito, A., Holtsford, T.P., 1997. The evolution of ribosomal DNA: divergent paralogues and phylogenetic implications. Genetics 145, 821–832.
Childs, G., Maxson, R., Cohn, R.H., Kedes, L., 1981. Orphons: dispersed genetic elements derived from tandem repetitive genes of eucaryotes. Cell 23, 651–663.
Crisp, M., Cook, L., Steane, D., 2004. Radiation of the Australian flora: what can comparisons of molecular phylogenies across multiple taxa tell us about the evolution of diversity in present-day communities? Phil. Trans. R. Soc. Lond. B 359, 1551–1571.
Felsenstein, J., 1978. Cases in which parsimony and compatibility methods will be positively misleading. Syst. Zool. 27, 401–410.
Gardiner-Garden, M., Sved, J.A., Frommer, M., 1992. Methylation sites in angiosperm genes. J. Mol. Evol. 34, 219–230.
Hartmann, S., Nason, J.D., Bhattacharya, D., 2001. Extensive ribosomal DNA genic variation in the columnar cactus Lophocereus. J. Mol. Evol. 53, 124–134.
Hershkovitz, M.A., Lewis, L.A., 1996. Deep-level diagnostic value of the rDNA-ITS region. Mol. Biol. Evol. 13, 1276–1295.
Hershkovitz, M.A., Zimmer, E.A., Hahn, W.J., 1999. Ribosomal DNA sequences and angiosperm systematics. In: Hollingsworth, P.M., Bateman, R.M., Gornall, R.J. (Eds.), Molecular Systematics and Plant Evolution. Taylor & Francis, London, pp. 268–326.
Hill, K.D., Johnson, L.A.S., 1995. Systematic studies in the eucalypts 7. A revision of the bloodwoods, genus Corymbia (Myrtaceae). Telopea 6, 185–504.
Hopper, S.D., 1979. Biogeographical aspects of speciation in the southwest Australian flora. Ann. Rev. Ecol. Syst. 10, 399–422.
Hopper, S.D., Gioia, P., 2004. The southwest Australian floristic region: evolution and conservation of a global hot spot of biodiversity. Ann. Rev. Ecol. Syst. 35, 623–650.
Jobes, D.V., Thien, L.B., 1997. A conserved motif in the 5.8S ribosomal RNA (rRNA) gene is a useful diagnostic marker for plant internal transcribed spacer (ITS) sequences. Plant Mol. Biol. Rep. 15, 326–334.
Käss, E., Wink, M., 1997. Molecular phylogeny and phylogeography of Lupinus (Leguminosae) inferred from nucleotide sequences of the rbcL gene and ITS 1 + 2 regions of rDNA. Plant Syst. Evol. 208, 139–167.
Kemp, E.M., 1981. Tertiary palaeogeography and the evolution of Australian climate. In: Keast, A. (Ed.), Ecological Biogeography of Australia. Junk, The Hague, pp. 31–50.
Kovarik, A., Pires, J.C., Leitch, A.R., Lim, K.Y., Sherwood, A.M., Matyasek, R., Rocca, J., Soltis, D.E., Soltis, P.S., 2005. Rapid concerted evolution of nuclear ribosomal DNA in two Tragopogon allopolyploids of recent and recurrent origin. Genetics 169, 931–944.
Ladiges, P.Y., Humphries, C.J., Brooker, M.I.H., 1987. Cladistic and bio- […] L'Herit., informal subgenus Monocalyptus Pryor & Johnson. Aust. J. Bot. 35, 251–281.
Ladiges, P.Y., Udovicic, F., Nelson, G., 2003. Australian biogeographic connections and the phylogeny of large genera in the plant family Myrtaceae. J. Biogeogr. 30, 989–998.
Lieckfeldt, E., Seifert, K.A., 2000. An evaluation of the use of ITS sequences in the taxonomy of the Hypocreales. Stud. Mycol. 45, 35–44.
Liston, A., Robinson, W.A., Oliphant, J.M., Alvarez-Buylla, E.R., 1996. Length variation in the nuclear ribosomal DNA internal transcribed spacer region of non-flowering seed plants. Syst. Bot. 21, 109–120.
Liu, J.S., Schardl, C.L., 1994. A conserved sequence in internal transcribed spacer 1 of plant nuclear rRNA genes. Plant Mol. Biol. 26, 775–778.
Mayol, M., Roselló, J.A., 2001. Why nuclear ribosomal spacers (ITS) tell different stories in Quercus. Mol. Phylogenet. Evol. 19, 167–176.
Mckinnon, G.E., Steane, D.A., Potts, B.M., Vaillancourt, R.E., 1999. Incongruence between chloroplast and species phylogenies in Eucalyptus subgenus Monocalyptus (Myrtaceae). Am. J. Bot. 86, 1038–1046.
Muir, G., Flemming, C.C., Schlötterer, C., 2001. Three divergent rDNA clusters predate the species divergence in Quercus petraea (Matt.) Liebl. and Quercus robur L. Mol. Biol. Evol. 18, 112–119.
Nelson, E.C., 1981. Phytogeography of southern Australia. In: Keast, A. (Ed.), Ecological Biogeography of Australia. Junk, The Hague, pp. 733–759.
Ng, H., Bird, A., 1999. DNA methylation and chromatin modification. Curr. Opin. Genet. Dev. 9, 158.
Nicolle, D., 2003. Currency Creek Arboretum (CCA) Eucalypt Research. Volume 2. D. Nicolle, Adelaide, Australia.
Parra-O, C., Bayly, M., Udovicic, F., Ladiges, P., 2006. ETS sequences support the monophyly of the eucalypt genus Corymbia (Myrtaceae). Taxon 55, 653–663.
Poke, F.S., Vaillancourt, R.E., Elliott, R.C., Reid, J.B., 2003. Sequence variation in two lignin biosynthesis genes, cinnamoyl CoA reductase (CCR) and cinnamyl alcohol dehydrogenase 2 (CAD2). Mol. Breed. 12, 107–118.
Poke, F.S., Martin, D.P., Steane, D.A., Vaillancourt, R.E., Reid, J.B., 2006. The impact of intragenic recombination on phylogenetic reconstruction at the sectional level in Eucalyptus when using a single copy nuclear gene (cinnamoyl CoA reductase). Mol. Phylogenet. Evol. 39, 160–170.
Pryor, L.D., Johnson, L.A.S., 1971. A Classification of the Eucalypts. Australian National University Press, Canberra.
Rambaut, A., 1996. Se-Al: Sequence Alignment Editor.
Razafimandimbison, S.G., Kellogg, E.A., Bremer, B., 2004. Recent origin and phylogenetic utility of divergent ITS putative pseudogenes: a case study from Naucleeae (Rubiaceae). Syst. Biol. 53, 177–192.
Sanderson, M.J., Doyle, J.J., 1992. Reconstruction of organismal and gene phylogenies from data on multigene families: Concerted evolution, homoplasy, and confidence. Syst. Biol. 41, 4–17.
Steane, D.A., McKinnon, G.E., Vaillancourt, R.E., Potts, B.M., 1999. ITS sequence data resolve higher level relationships among the eucalypts. Mol. Phylogenet. Evol. 12, 215–223.
Steane, D.A., Nicolle, D., McKinnon, G.E., Vaillancourt, R.E., Potts, B.M., 2002. Higher level relationships among the eucalypts are resolved by ITS-sequence data. Aust. Syst. Bot. 15, 49–62.
Swofford, D.L., 2000. PAUP* 4.0: Phylogenetic Analysis Using Parsimony (*and Other Methods). Sinauer Associates, Sunderland, Massachusetts.
Udovicic, F., Ladiges, P.Y., 2000. Informativeness of nuclear and chloroplast DNA regions and the phylogeny of the eucalypts and related genera (Myrtaceae). Kew Bull. 55, 633–645.
Vairapandi, M., Duker, N.J., 1994. Excision of ultraviolet-induced photoproducts of 5-methylcytosine from DNA. Mutat. Res. 315, 85–94.
Wagner, A., Blackstone, N., Cartwright, P., Dick, M., Misof, B., Snow, P., Wagner, G., Bartels, J., Murtha, M., Pendleton, J., 1994. Surveys of gene families using polymerase chain reaction: PCR selection and PCR drift. Syst. Biol. 43, 250–261.
White, T.J., Bruns, T.D., Lee, S.B., Taylor, J.W., 1990. Amplification and direct sequencing of fungal ribosomal RNA Genes for phylogenetics. In: Innis, N., Gelfand, D., Sninsky, J., White, T. (Eds.), PCR - Protocols and Applications - A Laboratory Manual. Academic Press, New York, pp. 315–322.
Whittock, S., Steane, D.A., Vaillancourt, R.E., Potts, B.M., 2003. Molecular evidence shows that the tropical boxes (Eucalyptus subgenus Minutifructus) are over-ranked. Trans. R. Soc. S. Aust. 127, 27–32.
Won, H., Renner, S.S., 2005. The internal transcribed spacer of nuclear ribosomal DNA in the gymnosperm Gnetum. Mol. Phylogenet. Evol. 36, 581–597.
Zuker, M., 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31, 3406–3415.