QUANG and HUYNH: Complete chloroplast genome of Syzygium glomeratum (Myrtaceae) and phylogenetic analysis
Abstract
Syzygium glomeratum (Lam.) DC., known as “Tram Tron” in Vietnam, is an evergreen tree valued for its medicinal properties. To understand the genomic and evolutionary basis of this plant, we sequenced and assembled the complete chloroplast genome of S. glomeratum for the first time, using the Illumina platform. The complete chloroplast genome of S. glomeratum was 158,469 bp in length and contained a large single-copy region (87,962 bp) and a small single-copy region (18,391 bp) separated by a pair of inverted repeat regions (26,058 bp each). The genome encoded 85 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. A phylogenetic analysis of 42 Syzygium species indicated that S. glomeratum is most closely related to S. cinereum and S. claviflorum. In conclusion, our results provide chloroplast genome information for S. glomeratum that is useful for further studies of biodiversity, conservation, and evolutionary biology.
Keywords: chloroplast genome, phylogenetic analysis, Syzygium glomeratum
INTRODUCTION
Syzygium (Myrtaceae) is a large genus comprising over 1,200 species of trees and shrubs native to tropical and subtropical regions of Asia and Africa (POWO, 2024). Many species within this genus hold significant ecological, economic, and medicinal value, and they are used for ornamental purposes, timber production, edible fruits, and traditional medicines (Rani et al., 2021; Uddin et al., 2022). Syzygium glomeratum (Lam.) DC. is a large evergreen tree native to the Indomalayan region and widely cultivated in tropical and subtropical areas for its ornamental qualities and edible fruits (Nigam et al., 2012). In Vietnam, S. glomeratum, known locally as “Tram Tron,” is documented in An Illustrated Flora of Vietnam (Ho, 1999). Recently, antibacterial activity against methicillin-resistant Staphylococcus aureus was reported for the plant (Mai et al., 2020).
Despite its widespread distribution, economic importance, and potential medicinal applications, genetic information on S. glomeratum remains lacking, limiting our understanding of its phylogenetic position within Syzygium and its potential applications in genomics-based studies.
Genomic markers from the chloroplast (cp) genome offer a valuable tool for resolving phylogenetic relationships across various taxonomic levels due to the conserved structure, gene content, uniparental inheritance, and relative stability of this genome (Wicke et al., 2011). In addition, comparative genomic analyses provide insights into the evolutionary histories and diversification patterns within plant lineages (Jansen and Ruhlman, 2012).
In this study, we report the complete cp genome of S. glomeratum obtained by next-generation sequencing and de novo assembly. We also describe its gene content and structural features, and present phylogenetic trees to elucidate the evolutionary relationships of S. glomeratum within the genus Syzygium. The complete sequence of the cp genome of S. glomeratum may serve as a useful genomic resource for phylogenetic studies in Syzygium.
MATERIALS AND METHODS
Plant sampling and DNA isolation
Leaf samples were procured from an individual S. glomeratum tree specimen located in Phuoc Minh district, Tay Ninh province, Vietnam (11°19′55.3″N, 106°18′02.0″E). A taxonomic expert verified the plant identity, and the voucher specimen was labeled UMP_2024.02.05_TramTron and deposited in the herbarium of the Faculty of Pharmacy, University of Medicine and Pharmacy at Ho Chi Minh City. Fresh leaves were desiccated using silica gel at room temperature and stored for subsequent experiments. No permits were required to collect S. glomeratum samples.
Total genomic DNA was isolated from dehydrated leaf tissues following the modified cetyltrimethylammonium bromide protocol (Porebski et al., 1997). The extracted DNA was further purified using a commercial Monarch Genomic DNA Purification Kit (#T3010, New England Biolabs, Ipswich, MA, USA) according to the manufacturer’s instructions. The purified genomic DNA, with an A260/A280 ratio of approximately 1.8–2.0, was stored at –20°C until it was used to construct the sequencing library.
Sequencing, assembly, and annotation of the cp genome
Library preparation was carried out using the NEBNext Ultra II DNA Library Prep kit (#E7103, New England Biolabs). High-throughput sequencing (2 × 150 bp) was performed using a MiSeq sequencer (Illumina, San Diego, CA, USA). After quality control using FastQC (Andrews, 2010) and read trimming using Trimmomatic (Bolger et al., 2014), the clean reads were used to assemble the complete cp genome de novo using the GetOrganelle v1.7.7.0 (Jin et al., 2020) and NOVOPlasty v4.3.1 (Dierckxsens et al., 2016) pipelines, with the cp genome of Syzygium polyanthum (Wight) Walp. (accession number: NC_072979) as a reference (Nguyen et al., 2023). The assembled S. glomeratum cp genome was annotated using the GeSeq tool (Tillich et al., 2017). All protein-coding genes and tRNA genes were confirmed by BLAST and tRNAscan-SE v2.0 (Chan and Lowe, 2019), respectively. The annotated gene content was manually curated using Geneious Prime v2024.0.2. A circular map of the cp genome was generated using OGDRAW v1.3.1 (Greiner et al., 2019).
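The read-cleaning step can be illustrated with a minimal sketch of sliding-window quality trimming, the general idea behind window-based read trimmers. This is a simplified illustration and not Trimmomatic's implementation; the function name, the window size (4), and the mean-quality threshold (15) are assumptions chosen for the example, since the trimming parameters used in this study are not stated.

```python
def sliding_window_trim(seq, quals, window=4, min_mean_q=15):
    """Cut a read at the first window whose mean Phred quality is below min_mean_q.

    `window` and `min_mean_q` are hypothetical defaults used only for illustration.
    Returns the retained sequence and its quality scores.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    # Slide the window from the 5' end toward the 3' end
    for start in range(0, len(seq) - window + 1):
        window_quals = quals[start:start + window]
        if sum(window_quals) / window < min_mean_q:
            # Keep only the bases upstream of the failing window
            return seq[:start], quals[:start]
    return seq, quals


# Toy read: six high-quality bases followed by four low-quality bases
trimmed_seq, trimmed_quals = sliding_window_trim("ACGTACGTAC", [30] * 6 + [5] * 4)
print(trimmed_seq)
```

Reads shorter than the window are returned unchanged in this sketch; a production trimmer would also apply a minimum-length filter.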
Phylogenetic analysis
A total of 41 Syzygium cp genomes were retrieved from the NCBI GenBank database (https://www.ncbi.nlm.nih.gov/) and analyzed together with the newly assembled S. glomeratum cp genome, giving 42 Syzygium cp genomes (Table 1). In addition, the cp genome of Backhousia citriodora F. Muell. (accession number ON422330), a member of the Myrtaceae family, was used as an outgroup. To reconstruct the phylogenetic tree, individual protein-coding genes (PCGs) from all cp genomes were extracted and aligned using Geneious Prime v2024.0.2. The TrimAl tool was used to remove gaps and poorly aligned regions from the sequence alignments (Capella-Gutiérrez et al., 2009). Afterward, the aligned PCGs were concatenated in Geneious Prime, resulting in a combined alignment dataset. The optimal nucleotide substitution model, identified using jModelTest v2 (Posada, 2008), was GTR+I+G. Bayesian inference (BI) and maximum likelihood (ML) phylogenetic trees were reconstructed using MrBayes v3.2.7a (Huelsenbeck and Ronquist, 2001) and IQ-TREE v1.6.12 (Nguyen et al., 2015), respectively. For the ML analysis, IQ-TREE was executed with 1,000 bootstrap replicates. The BI analysis was run for 1,000,000 generations, by which point the average standard deviation of split frequencies had fallen below 0.01. The resulting ML and BI phylogenetic trees were visualized and annotated using FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/).
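The concatenation step can be sketched in plain Python. In this study the concatenation was done in Geneious Prime; the function below is only an illustration of the idea, and its name, the input layout (gene name mapped to a dictionary of taxon and aligned sequence), the gap-padding of missing genes, and the toy sequences are all assumptions.

```python
def concatenate_alignments(gene_alignments, gap="-"):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: dict mapping gene name -> {taxon: aligned sequence}.
    Genes are joined in alphabetical order; a taxon lacking a gene is
    padded with gap characters so that all rows keep the same length.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    parts = {t: [] for t in taxa}
    for gene in sorted(gene_alignments):
        aln = gene_alignments[gene]
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        width = lengths.pop()
        for t in taxa:
            parts[t].append(aln.get(t, gap * width))
    return {t: "".join(chunks) for t, chunks in parts.items()}


# Toy example with hypothetical taxa A, B, C and two short "genes"
toy = {
    "rbcL": {"A": "ATG-", "B": "ATGC"},
    "matK": {"A": "TT", "B": "TA", "C": "TA"},
}
supermatrix = concatenate_alignments(toy)
print(supermatrix)
```

The resulting dictionary can be written out in FASTA or PHYLIP format for downstream ML or BI analysis.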
RESULTS AND DISCUSSION
Genome features of the S. glomeratum cp genome
The complete cp genome of S. glomeratum was successfully assembled and annotated in this study and was deposited in GenBank under accession number PP734015. The circular cp genome was 158,469 bp in length and had an overall GC content of 37.0% (Fig. 1). The genome showed a typical quadripartite structure consisting of a large single-copy region (LSC, 87,962 bp in length) and a small single-copy region (SSC, 18,391 bp in length) separated by two inverted repeat regions (IRa and IRb, 26,058 bp each). The GC contents of the LSC, SSC, and IR regions were 34.8%, 31.0%, and 42.7%, respectively. Overall, the cp genome of S. glomeratum was similar to the typical quadripartite cp genome of angiosperms (Jansen and Ruhlman, 2012). Compared with the reported cp genomes of Syzygium species, no notable structural variations (i.e., gene loss, IR loss, or large inversions) were found in the cp genome of S. glomeratum.
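As a quick consistency check on the figures above, the region lengths should sum to the genome length (counting the IR twice), and the length-weighted mean of the regional GC contents should reproduce the overall GC content. The short sketch below uses only values reported in the text; the variable names are illustrative.

```python
# Region lengths (bp) and GC contents (%) as reported in the text
LSC, SSC, IR = 87_962, 18_391, 26_058
GC = {"LSC": 34.8, "SSC": 31.0, "IR": 42.7}

# The IR is present in two copies (IRa and IRb)
total = LSC + SSC + 2 * IR

# Length-weighted mean GC content across the four regions
weighted_gc = (LSC * GC["LSC"] + SSC * GC["SSC"] + 2 * IR * GC["IR"]) / total

print(total)                  # 158469
print(round(weighted_gc, 1))  # 37.0
```

Both values agree with the reported genome length of 158,469 bp and overall GC content of 37.0%.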
A total of 130 genes were annotated in the S. glomeratum cp genome, including 85 PCGs, 8 ribosomal RNA genes, and 37 transfer RNA genes (Table 2). The gene content and orientation are highly conserved compared with other cp genomes of Syzygium species (Asif et al., 2013; Zhang et al., 2019; Chen et al., 2022; Nguyen et al., 2023). Among the 130 annotated genes, 18 contained introns, with 15 genes having a single intron and 3 genes (rpl2, rps12, and clpP) containing two introns. Seventeen genes were duplicated in the IR regions, including six PCGs (rpl2, rpl23, ycf2, ndhB, rps7, rps12), seven tRNA genes (trnI-CAU, trnL-CAA, trnV-GAC, trnI-GAU, trnA-UGC, trnR-ACG, trnN-GUU), and four rRNA genes (rrn16, rrn23, rrn4.5, rrn5). The rps12 gene was identified as a trans-spliced gene with three exons located in distinct regions of the cp genome: exon 1 was located in the LSC region, and exons 2 and 3 were in the IR regions.
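The gene counts reported here can be reconciled with the unique-gene counts in Table 2 by adding the IR-duplicated copies to each category. The sketch below uses only numbers stated in this paper; the dictionary names are illustrative.

```python
# Unique genes per category (Table 2) and genes duplicated in the IR regions
unique = {"PCG": 79, "tRNA": 30, "rRNA": 4}
duplicated_in_ir = {"PCG": 6, "tRNA": 7, "rRNA": 4}

# Each duplicated gene contributes one extra copy to the total
total_by_type = {k: unique[k] + duplicated_in_ir[k] for k in unique}

print(total_by_type)                   # {'PCG': 85, 'tRNA': 37, 'rRNA': 8}
print(sum(unique.values()))            # 113
print(sum(duplicated_in_ir.values()))  # 17
print(sum(total_by_type.values()))     # 130
```

The totals match the 85 PCGs, 37 tRNA genes, and 8 rRNA genes (130 genes) given in the text.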
Phylogenetic analysis
We used ML and BI methods to reconstruct phylogenetic trees based on the cp genome CDS of 43 species (S. glomeratum, 41 other Syzygium species, and B. citriodora as an outgroup). The resulting ML and BI trees had an identical topology, with strong support (bootstrap > 70 and posterior probability > 0.95) at many nodes (Fig. 2). A close relationship among six Syzygium species was previously reported based on morphological markers (Cheong and Ranghoo-Sanmukhiya, 2013). That study showed that S. glomeratum is closely related to a group including S. venosum, S. coriaceum, and S. petrinense (Cheong and Ranghoo-Sanmukhiya, 2013). In this study, based on 79 PCGs from the cp genome, S. glomeratum formed a monophyletic group with S. cinereum and S. claviflorum. Furthermore, the tree topology closely aligns with prior phylogenetic studies based on cp genomic data (Eguiluz et al., 2017; Nguyen et al., 2023; Huynh et al., 2024; Sun et al., 2024). However, the recent study by Low et al. (2022), which utilized single nucleotide polymorphism data from nuclear loci in the genus Syzygium, is inconsistent with our findings. Specifically, it indicated that S. glomeratum is closely related to S. buxifolium rather than S. cinereum (Low et al., 2022). This incongruence between nuclear and cp phylogenies could be attributed to factors such as convergent evolution, lineage sorting, or reticulate evolution (Nishimoto et al., 2003; Yu et al., 2013). Syzygium is a large genus with more than 1,200 species, but cp genome sequences have been reported for only a few of them. Further studies are therefore needed to fully resolve the evolutionary and phylogenetic relationships within the genus.
CONCLUSION
In this study, we successfully assembled and characterized the complete cp genome sequence of S. glomeratum, a commercially important tree species within the Myrtaceae family. The assembled cp genome exhibited a typical quadripartite structure with 113 unique genes (including 79 PCGs, 30 tRNAs, and 4 rRNAs), and was conserved and similar to those of other taxa in the genus Syzygium. Phylogenetic analysis of 79 PCGs of the cp genome provided robust insights into the evolutionary relationships of S. glomeratum. These results supported a monophyletic clade comprising S. glomeratum, S. cinereum, and S. claviflorum, and further resolved the position of S. glomeratum in the genus Syzygium. These findings corroborate the current taxonomic classification of S. glomeratum and contribute to a better understanding of the evolutionary history and diversification within the genus Syzygium.
ACKNOWLEDGMENTS
Minh Trong Quang was funded by the Master, PhD Scholarship Program of Vingroup Innovation Foundation, code VINIF.2021.ThS.69 and VINIF.2022.ThS.054. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Fig. 1.
Map of the Syzygium glomeratum chloroplast genome. The genome includes a large single-copy (LSC) region, a small single-copy (SSC) region, and two inverted repeat regions (IRA and IRB). Genes positioned outside the circle are transcribed clockwise, whereas those inside the circle are transcribed counterclockwise. Different functional groups of genes are indicated by color coding. The inner gray circle represents the GC content.
Fig. 2.
The phylogenetic relationship of Syzygium glomeratum within the genus Syzygium, inferred from 79 concatenated PCGs of the cp genome using the maximum likelihood and Bayesian inference methods. Numbers above the branches represent bootstrap (maximum likelihood method) and posterior probability (Bayesian inference method) values. Bootstrap values of 100 and posterior probabilities of 1 are not shown. The complete cp genome of S. glomeratum is shown in bold and underlined.
Table 1.
List of GenBank accession numbers for reference sequences used for phylogenetic analysis.
No. | Name of species | GenBank accession numbers
1 | Syzygium acuminatissimum | NC_053640
2 | Syzygium adelphicum | NC_084350
3 | Syzygium alatum | NC_084380
4 | Syzygium album | NC_060587
5 | Syzygium aromaticum | NC_047249
6 | Syzygium australe | NC_082025
7 | Syzygium bamagense | NC_086714
8 | Syzygium branderhorstii | NC_084381
9 | Syzygium buettnerianum | NC_084382
10 | Syzygium buxifolium | NC_084371
11 | Syzygium caryophyllatum | NC_087814
12 | Syzygium cinereum | NC_086715
13 | Syzygium cladopterum | NC_084383
14 | Syzygium claviflorum | NC_087811
15 | Syzygium cumini | NC_053327
16 | Syzygium effusum | NC_084386
17 | Syzygium fluviatile | NC_082026
18 | Syzygium forrestii | NC_044106
19 | Syzygium garcinioides | NC_084387
20 | Syzygium glomeratum | PP734015
21 | Syzygium grijsii | NC_065156
22 | Syzygium jambos | NC_052728
23 | Syzygium jiewhoei | NC_084388
24 | Syzygium malaccense | NC_052867
25 | Syzygium megacarpum | NC_082027
26 | Syzygium nervosum | NC_053907
27 | Syzygium odoratum | NC_059005
28 | Syzygium pachycladum | NC_084389
29 | Syzygium paniculatum | NC_087812
30 | Syzygium polyanthum | NC_072979
31 | Syzygium puberulum | NC_087813
32 | Syzygium rehderianum | NC_065261
33 | Syzygium roemeri | NC_084390
34 | Syzygium saliciforme | NC_084391
35 | Syzygium samarangense | NC_060657
36 | Syzygium sayeri | NC_084392
37 | Syzygium suberosum | NC_084393
38 | Syzygium taeniatum | NC_084394
39 | Syzygium tierneyanum | NC_084395
40 | Syzygium tsoongii | NC_082028
41 | Syzygium tympananthum | NC_084396
42 | Syzygium versteegii | NC_084397
43 | Backhousia citriodora | ON422330
Table 2.
List of genes annotated in the cp genome of Syzygium glomeratum.
Groups of genes (No.) | Name of the genes
Ribosomal RNAs (4) | rrn4.5 (2×), rrn5 (2×), rrn16 (2×), rrn23 (2×)
Transfer RNAs (30) | trnA-UGCa(2×), trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnG-UCCa, trnG-GCC, trnH-GUG, trnI-GAUa(2×), trnK-UUUa, trnL-CAA (2×), trnL-UAAa, trnL-UAG, trnfM-CAU, trnM-CAU (2×), trnM-CAU, trnN-GUU (2×), trnP-UGG, trnQ-UUG, trnR-ACG (2×), trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC (2×), trnV-UACa, trnW-CCA, trnY-GUA
Large units of ribosomes (9) | rpl2a(2×), rpl14, rpl16a, rpl20, rpl22, rpl23 (2×), rpl32, rpl33, rpl36
Small units of ribosomes (12) | rps2, rps3, rps4, rps7 (2×), rps8, rps11, rps12b(2×), rps14, rps15, rps16a, rps18, rps19
RNA polymerase (4) | rpoA, rpoB, rpoC1a, rpoC2
Translational initiation factor (1) | infA
Subunit of photosystem I (7) | psaA, psaB, psaC, psaI, psaJ, pafIb, pafII
Subunit of photosystem II (15) | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbN, psbM, psbT, psbZ
Subunit of cytochrome (6) | petA, petBa, petDa, petG, petL, pbfI
Subunit of ATP synthases (6) | atpA, atpB, atpE, atpF1, atpH, atpI
Large unit of Rubisco (1) | rbcL
Subunit of NADH dehydrogenase (11) | ndhAa, ndhBa(2×), ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
Maturase (1) | matK
Envelope membrane protein (1) | cemA
Subunit of acetyl-CoA carboxylase (1) | accD
C-type cytochrome synthesis gene (1) | ccsA
ATP-dependent protease subunit P (1) | clpPb
Component of the TIC complex (1) | ycf1
Hypothetical proteins and conserved reading frames (1) | ycf2 (2×)
LITERATURE CITED
Asif, H., Khan, A. Iqbal, A. Khan, I. A. Heinze, B. and Azim, M. K. 2013. The chloroplast genome sequence of Syzygium cumini (L.) and its relationship with other angiosperms. Tree Genetics & Genomes 9: 867-877.
Bolger, A. M., Lohse, M. and Usadel, B. 2014. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114-2120.
Chen, L.-D., Wang, H.-F. and Hou, D.-J. 2022. The complete plastome of Syzygium odoratum (Lour.) DC. 1928 (Myrtaceae). Mitochondrial DNA Part B 7: 705-706.
Cheong, M. L. S. and Ranghoo-Sanmukhiya, V. M. 2013. Phylogeny of Syzygium species using morphological, RAPD and ISSR markers. International Journal of Agriculture & Biology 15: 511-516.
Dierckxsens, N., Mardulyn, P. and Smits, G. 2016. NOVOPlasty: De novo assembly of organelle genomes from whole genome data. Nucleic Acids Research 45: e18.
Eguiluz, M., Yuyama, P. M. Guzman, F. Rodrigues, N. F. and Margis, R. 2017. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora. Genetics and Molecular Biology 40: 871-876.
Greiner, S., Lehwark, P. and Bock, R. 2019. OrganellarGenome-DRAW (OGDRAW) version 1.3.1: Expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Research 47: W59-W64.
Ho, P. H. 1999. An Illustrated Flora of Vietnam. 2: Youth Publisher, Hanoi. Pp. 50 pp.
Huynh, T.-T. T, Quang, M. T. and Nguyen, H. D. 2024. The complete chloroplast genome of Syzygium syzygioides (Myrtaceae: Myrtales) and phylogenetic analysis. Biomedical and Biotechnology Research Journal (BBRJ) 8: 409-414.
Jansen, R. K. and Ruhlman, T. A. 2012. Plastid genomes of seed plants. Genomics of Chloroplasts and Mitochondria. Advances in Photosynthesis and Respiration. 35: Springer, Dordrecht. Pp. 103-126.
Low, Y. W., Rajaraman, S. Tomlin, C. M. Ahmad, J. A. Ardi, W. H. Armstrong, K. Athen, P. Berhaman, A. Bone, R. E. Cheek, M. Cho, N. R. W. Choo, L. M. Cowie, I. D. Crayn, D. Fleck, S. J. Ford, A. J. Forster, P. I. Girmansyah, D. Goyder, D. J. Gray, B. Heatubun, C. D. Ibrahim, A. Ibrahim, B. Jayasinghe, H. D. Kalat, M. A. Kathriarachchi, H. S. Kintamani, E. Koh, S. L. Lai, J. T. K. Lee, S. M. L. Leong, P. K. F. Lim, W. H. Lum, S. K. Y. Mahyuni, R. McDonald, W. J. F. Metali, F. Mustaqim, W. A. Naiki, A. Ngo, K. M. Niissalo, M. Ranasinghe, S. Repin, R. Rustiami, H. Simbiak, V. I. Sukri, R. S. Sunarti, S. Trethowan, L. A. Trias-Blasi, A. Vasconcelos, T. N. C. Wanma, J. F. Widodo, P. Wijesundara, D. S. A. Worboys, S. Yap, J. W. Yong, K. T. Khew, G. S. W. Salojärvi, J. Michael, T. P. Middleton, D. J. Burslem, D. F. R. P. Lindqvist, C. Lucas, E. J. and Albert, V. A. 2022. Genomic insights into rapid speciation within the world’s largest tree genus Syzygium. Nature Communications 13: 5031.
Mai, T. T. N. L., Hoang, H. A. and Truong, T. V. 2020. Antibacterial activity of tram tron Syzygium glomerulatum extract against methicillin-resistant Staphylococcus aureus. Chemical Engineering Transactions 78: 235-240.
Nguyen, H. D., Vu, M. T. and Do, H. D. K. 2023. The complete chloroplast genome of Syzygium polyanthum (Wight) Walp. (Myrtales: Myrtaceae). Journal of Asia-Pacific Biodiversity 16: 267-271.
Nigam, V., Nigam, R. and Singh, A. 2012. Distribution and medicinal properties of Syzygium species. Current Research in Pharmaceutical Sciences 2: 73-80.
Nishimoto, Y., Ohnishi, O. and Hasegawa, M. 2003. Topological incongruence between nuclear and chloroplast DNA trees suggesting hybridization in the urophyllum group of the genus Fagopyrum (Polygonaceae). Genes & Genetic Systems 78: 139-153.
Porebski, S., Bailey, L. G. and Baum, B. R. 1997. Modification of a CTAB DNA extraction protocol for plants containing high polysaccharide and polyphenol components. Plant Molecular Biology Reporter 15: 8-15.
Posada, D. 2008. jModelTest: Phylogenetic model averaging. Molecular Biology and Evolution 25: 1253-1256.
Rani, S. P. S., Kurup, S. R. R. Nair, S. A. Beevi, P. N. Thankappan, S. and Baby, S. 2021. Antiproliferative activity of leaf, fruit pericarp essential oils of Syzygium palodense. Phytomedicine Plus 1: 100128.
Sun, Z., Zhang, Y. Zou, S. Zhang, S. and Feng, C. 2024. Complete chloroplast genomes of four Syzygium species and comparative analysis with other Syzygium species. Biologia 79: 45-58.
Yu, W.-B., Huang, P.-H. Li, D.-Z. and Wang, H. 2013. Incongruence between nuclear and chloroplast DNA phylogenies in Pedicularis section Cyathophora (Orobanchaceae). PLoS ONE 8: e74828.
Zhang, X.-F., Wang, J.-H. Wang, H.-X. Zhao, K.-K. Zhu, Z.-X. and Wang, H.-F. 2019. Complete plastome sequence of Syzygium forrestii Merr. et Perry (Myrtaceae): An endemic species in China. Mitochondrial DNA Part B 4: 126-127.