Complete chloroplast genome sequence of Clematis calcicola (Ranunculaceae), a species endemic to Korea

Article information

Korean J. Pl. Taxon. 2022;52(4):262-268
Division of Forest Biodiversity, Korea National Arboretum, Pocheon 11186, Korea
Corresponding author Sang-Chul KIM, E-mail: majin01@korea.kr
Received 2022 November 2; Revised 2022 December 14; Accepted 2022 December 23.

Abstract

The complete chloroplast genome (cp genome) sequence of Clematis calcicola J. S. Kim (Ranunculaceae) is 159,655 bp in length. It consists of large (79,451 bp) and small (18,126 bp) single-copy regions and a pair of identical inverted repeats (31,039 bp each). The genome contains 92 protein-coding genes, 36 transfer RNA genes, eight ribosomal RNA genes, and two pseudogenes. A phylogenetic analysis based on the cp genome of 19 taxa showed high similarity between the cp genome of C. calcicola, which is recognized as a species endemic to the Korean Peninsula, and the data published for C. macropetala. The complete cp genome sequence of C. calcicola reported here provides important information for future phylogenetic and evolutionary studies of Ranunculaceae.

INTRODUCTION

The genus Clematis L. (Ranunculaceae) contains ca. 300 species of herbaceous or woody vines and a few erect shrubs and perennial grasses. They are common in temperate regions, including northern Europe, Siberia, and the Far East, and some species also occur in tropical regions (Tamura, 1955, 1967, 1987, 1995; Wang and Bartholomew, 2001; Wang and Li, 2005; Kadota, 2006; Chang, 2007; Kim, 2017a; Chang and Kim, 2018; Park et al., 2022). Clematis calcicola J. S. Kim is endemic to the Korean Peninsula (Kim et al., 2009; National Institute of Biological Resources, 2011; Chung et al., 2017; Kim, 2017a, 2017b; Lee and Kim, 2018). This species is easily distinguishable from closely related taxa by its sparsely dentate, glabrous, and subcoriaceous leaflets and thick and smooth sepals (Kim et al., 2009; Kim, 2017a). However, C. calcicola has a limited distribution in Korea. In the present study, the complete chloroplast genome of C. calcicola is reported for the first time, to the best of our knowledge. A phylogenetic analysis was also carried out to investigate the relationships of C. calcicola with other Clematis species.

Chloroplasts are metabolic organelles with an important role in the physiology and development of terrestrial plants and algae (Gray, 1989; Howe et al., 2003). The chloroplast genome is a suitable tool for studying the evolution and phylogeny of plants because of its highly conserved sequence and structure (Maier et al., 1995). Chloroplasts have their own genetic replication mechanisms, and they transcribe their genome relatively independently (Fu et al., 2016). Most chloroplast genomes range from 120 to 200 kb in length and exhibit a typical quadripartite structure, including a large single-copy sequence (LSC, 80–90 kb), a small single-copy sequence (SSC, 16–27 kb), and two inverted repeat sequences (IRs, 20–28 kb) of subequal length (Palmer, 1985; Wang et al., 2008). A complete chloroplast genome sequence provides a large amount of information, including not only information on protein-coding and non-coding genes but also data for inferring gene rearrangements and evolutionary relationships (Golenberg et al., 1993; Reith and Munholland, 1995).

The complete chloroplast genome of C. calcicola and the phylogenetic analysis results reported in the present study provide information on the relationship between C. calcicola and related species, which may be a valuable resource for future studies on this species.

MATERIALS AND METHODS

Clematis calcicola was sampled from Mt. Deokhang, Samcheok-si, Gangwon-do, South Korea (37°18′49.0″N, 129°00′41.5″E). The collected material was deposited at the Herbarium of the Korea National Arboretum (KH; contact: Dong Chan Son) under voucher Deokhangsan-190529-001.

Fresh leaves were silica-dried. Leaf DNA was extracted using a DNeasy Plant Mini Kit (Qiagen, Seoul, Korea) and verified by 2% agarose gel electrophoresis. The DNA library was constructed using the TruSeq Nano DNA Kit following the Sample Preparation Guide protocol provided by the manufacturer (Macrogen Inc., Seoul, Korea). Paired-end genome sequencing was performed at Macrogen Inc. on an Illumina platform (Illumina Inc., San Diego, CA, USA) with a read length of 301 bp.

The complete chloroplast genome was assembled using Geneious v9.0.5 (Biomatters, Auckland, New Zealand) and annotated using the GeSeq tool (Tillich et al., 2017) and Geneious v9.0.5 (Biomatters).

To infer the phylogenetic relationships among Clematis species, the complete chloroplast genome sequences of 20 Clematis species were downloaded from GenBank (https://www.ncbi.nlm.nih.gov/genbank/). Anemoclema glaucifolium (Franch.) W. T. Wang was used as the outgroup. Phylogenetic analysis was performed using 78 coding sequences of the Clematis species. Alignments were performed using MAFFT v7.450 (Katoh et al., 2002; Katoh and Standley, 2013). A maximum likelihood (ML) bootstrap analysis with 1,000 replicates was performed, and the best-fit model (transversion model [TVM] + empirical base frequencies [F] + FreeRate model with two categories [R2]) was determined using the IQ-TREE web server (Trifinopoulos et al., 2016). Bayesian inference (BI) analysis was conducted using MrBayes v3.2.6 (Ronquist et al., 2012) in PhyloSuite (Zhang et al., 2020). The best model of molecular evolution for the chloroplast genome dataset (general time reversible [GTR] + proportion of invariable sites [I] + rate variation across sites [G] + F) was obtained in ModelFinder v2.0 (Kalyaanamoorthy et al., 2017).
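As an illustration of how per-gene alignments can feed a single ML or BI analysis, the sketch below joins aligned coding sequences into one supermatrix and records the partition boundaries. The text states only that 78 coding sequences were aligned with MAFFT; the concatenation step, the function name `concatenate_alignments`, and the toy sequences are assumptions made for illustration, and no MAFFT, IQ-TREE, or MrBayes call is made here.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence for one gene


def concatenate_alignments(
    genes: Dict[str, Alignment],
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments into one supermatrix.

    Taxa missing from a gene are padded with '-' so that every row keeps
    the same length. Returns the supermatrix and 1-based, inclusive
    partition ranges in the order the genes were supplied.
    """
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for gene, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(aln.get(taxon, "-" * width))
        partitions.append((gene, start, start + width - 1))
        start += width
    supermatrix = {taxon: "".join(parts) for taxon, parts in rows.items()}
    return supermatrix, partitions


# Toy input (hypothetical sequences, not data from this study).
toy_genes = {
    "rbcL": {"C_calcicola": "ATGTCA", "C_macropetala": "ATGTCA"},
    "matK": {"C_calcicola": "ATG-AA", "A_glaucifolium": "ATGGAA"},
}
supermatrix, partitions = concatenate_alignments(toy_genes)
for taxon, sequence in supermatrix.items():
    print(taxon, sequence)
print(partitions)
```

Keeping the partition ranges alongside the supermatrix makes it possible to fit either a single model to the whole matrix or separate models per gene later on.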

RESULTS AND DISCUSSION

Sequencing of Clematis calcicola produced 6,657,552 reads, 7,417 of which corresponded to the chloroplast genome (depth 5.4). The complete chloroplast genome sequence of C. calcicola was 159,655 bp in length (Fig. 1); it was deposited in GenBank under accession number OK181902, with associated BioProject, SRA, and BioSample numbers PRJNA886061, SRR21776408, and SAMN31117752, respectively. The LSC, SSC, and each of the IRs were 79,451 bp, 18,126 bp, and 31,039 bp in length, respectively. The overall GC content of the chloroplast genome was 38%, and that of the LSC, SSC, and each of the IRs was 36.3%, 31.3%, and 42%, respectively. The chloroplast genome contained 136 genes, including 92 coding genes, eight ribosomal RNA (rRNA) genes, and 36 transfer RNA (tRNA) genes. Twenty-six genes (15 coding genes, seven tRNA, and four rRNA) were duplicated in the IRs. In addition, infA, identified as a pseudogene, was located in the IRs (Table 1).
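The reported figures can be cross-checked with simple arithmetic. The minimal sketch below sums the four regions of the quadripartite structure, computes a length-weighted GC content from the per-region values, and adds up the gene counts. The variable names are illustrative, and the weighted mean is only an approximation because the per-region GC values in the text are rounded.

```python
# Region lengths (bp) and GC contents (%) as reported in the text.
regions = {
    "LSC": {"length": 79_451, "gc": 36.3, "copies": 1},
    "SSC": {"length": 18_126, "gc": 31.3, "copies": 1},
    "IR": {"length": 31_039, "gc": 42.0, "copies": 2},  # two inverted repeats
}

# Total genome length: LSC + SSC + 2 x IR.
total_length = sum(r["length"] * r["copies"] for r in regions.values())

# Length-weighted mean GC content over all four regions.
weighted_gc = (
    sum(r["length"] * r["copies"] * r["gc"] for r in regions.values())
    / total_length
)

# Coding genes + rRNA genes + tRNA genes, counting duplicated copies.
gene_total = 92 + 8 + 36

print(f"total length: {total_length} bp")
print(f"weighted GC content: {weighted_gc:.1f}%")
print(f"annotated genes: {gene_total}")
```

The sum of the regions reproduces the reported genome length of 159,655 bp, the weighted GC content rounds to the reported 38%, and the three gene classes add up to 136.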

Fig. 1.

Gene map of the complete chloroplast genome of Clematis calcicola. Genes drawn inside the circle are transcribed clockwise, and genes drawn outside the circle are transcribed counterclockwise. Different gene colors correspond to different gene functions. Inverted repeat (IR), small single-copy (SSC), and large single-copy (LSC) regions are indicated.

We used sequences from 21 species (20 species of Clematis and one species of Anemoclema as an outgroup) to infer ML-based and BI-based phylogenies. Both phylogenies (ML and BI) had the same topology and showed high support for each branch (Fig. 2). The phylogenetic analysis supported the monophyly of the genus Clematis. Nevertheless, the topology did not match the classification system divided into sections (Wang and Li, 2005; Kim et al., 2009). However, C. calcicola and C. macropetala Ledeb., both belonging to section Atragene, formed a well-supported group (bootstrap value [BS] = 100; posterior probability [PP] = 1), which was sister to section Cheiropsis (BS = 98; PP = 1) and branched first within the genus Clematis, forming a basal group. We performed a comparative analysis with the chloroplast genome of C. macropetala, which was identified as the species most closely related to C. calcicola in the phylogenetic analysis (Fig. 2). The chloroplast genome of C. calcicola was 8 bp longer than that of C. macropetala (159,647 bp, NC_041477). Nevertheless, the pairwise identity between the C. calcicola and C. macropetala chloroplast genomes was 99.96%.

Fig. 2.

Phylogenetic tree of Clematis calcicola and related taxa based on 78 protein-coding gene sequences of the chloroplast genome, inferred using maximum likelihood (ML) and Bayesian inference (BI) analyses. Numbers at each node are bootstrap values/posterior probability values. Anemoclema glaucifolium was set as the outgroup.

Sixteen single nucleotide polymorphisms (SNPs) and 52 insertion/deletion (indel) positions, grouped into 21 indel events, were identified in the pairwise alignment of the two Clematis chloroplast genomes (Tables 2 and 3). Of the 52 indel positions, 16 were found in four protein-coding regions (clpP1, ndhF, ccsA, and ycf1) and 36 in non-coding regions (Table 3).
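The counts of differing sites can be related to the reported pairwise identity with a short calculation. The sketch below is an estimate under stated assumptions, not the method used by the authors: identity is taken as the share of identical alignment columns, every SNP and every indel base is counted as one differing column, and the alignment length is approximated as the C. calcicola genome length plus the 22 bases present only in C. macropetala (summed from Table 3). The function name `pairwise_identity` is hypothetical.

```python
def pairwise_identity(alignment_length: int, snps: int, indel_positions: int) -> float:
    """Return percent identity as identical columns / all alignment columns."""
    if alignment_length <= 0:
        raise ValueError("alignment_length must be positive")
    differing = snps + indel_positions
    if differing > alignment_length:
        raise ValueError("more differing columns than alignment columns")
    return 100.0 * (alignment_length - differing) / alignment_length


CALCICOLA_LENGTH = 159_655    # bp, reported genome length
MACROPETALA_ONLY_BASES = 22   # gap columns in C. calcicola, summed from Table 3

# Assumed alignment length: every C. calcicola base plus the gap columns.
alignment_length = CALCICOLA_LENGTH + MACROPETALA_ONLY_BASES

identity = pairwise_identity(alignment_length, snps=16, indel_positions=52)
print(f"estimated pairwise identity: {identity:.2f}%")
```

Under these assumptions the estimate rounds to 99.96%, in line with the identity reported for the two plastomes.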

Clematis calcicola can be morphologically distinguished from C. macropetala by its terete branches, ternate leaves, subcoriaceous leaflets with slightly dentate to dentate margins, and linear-spatulate to spatulate staminodes (Kim et al., 2009; Yang et al., 2009). Although the two species are morphologically similar, the genome differences and phylogenetic analysis in the present study support that C. calcicola is closely related to but distinct from C. macropetala. In addition, C. macropetala is distributed from central to northeastern China, eastern Mongolia, and southeastern Russia, whereas C. calcicola is restricted to Korea.

Recently, the chloroplast genomes of three Korean Clematis species have been reported (Choi et al., 2021; Park et al., 2021). The results of this study add one more species to these and will provide important basic information for future phylogenetic and evolutionary studies of Clematis and Ranunculaceae. In addition, comparative data between the two species within section Atragene of the genus Clematis will provide useful information for developing species identification markers and conducting genetic diversity analyses.

Acknowledgements

This study was supported by the grant ‘Silvics of Korea’ [KNA-1-1-18, 15-3] financed by the Korea National Arboretum.

Notes

CONFLICTS OF INTEREST

The authors declare that there are no conflicts of interest.

References

Chang CS. 2007. Clematis L. The Genera of Vascular Plants of Korea In : Flora of Korea Editorial Committee, ed. Academy Publishing Co. Seoul: p. 191–195.
Chang CS, Kim H. 2018. Clematis L. The Genera of Vascular Plants of Korea In : Flora of Korea Editorial Committee, ed. Hongneung Science Publishing Co. Seoul: p. 253–259.
Choi KS, Ha Y-H, Gil H-Y, Choi K, Kim D-K, Oh S-H. 2021;Two Korean endemic Clematis chloroplast genomes: inversion, reposition, expansion of the inverted repeat region, phylogenetic analysis, and nucleotide substitution rates. Plants 10:397.
Chung GY, Chang KS, Chung J-M, Choi HJ, Paik W-K, Hyun J-O. 2017;A checklist of endemic plants on the Korean Peninsula. Korean Journal of Plant Taxonomy 47:264–288.
Fu P-C, Zhang Y-Z, Geng H-M, Chen S-L. 2016;The complete chloroplast genome sequence of Gentiana lawrencei var. farreri (Gentianaceae) and comparative analysis with its congeneric species. PeerJ 4:e2540.
Golenberg EM, Clegg MT, Durbin ML, Doebley J, Ma DP. 1993;Evolution of a noncoding region of the chloroplast genome. Molecular Phylogenetics and Evolution 2:52–64.
Gray MW. 1989;The evolutionary origins of organelles. Trends in Genetics 5:294–299.
Howe CJ, Barbrook AC, Koumandou VL, Nisbet RER, Symington HA, Wightman TF. 2003;Evolution of the chloroplast genome. Philosophical Transactions of the Royal Society of London Series B Biological Sciences 358:99–107.
Kadota Y. 2006. Clematis L. In Flora of Japan. IIa. Angiospermae, Dicotyledoneae, Archichlamydeae In : Iwatsuki K, Boufford DE, Ohba H, eds. Kodansha. Tokyo: p. 298–308.
Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. 2017;ModelFinder: Fast model selection for accurate phylogenetic estimates. Nature Methods 14:587–589.
Katoh K, Misawa K, Kuma KI, Miyata T. 2002;MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30:3059–3066.
Katoh K, Standley DM. 2013;MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Molecular Biology and Evolution 30:772–780.
Kim J-S, Chung J-M, Kim S-Y, Park J-H. 2009;Clematis calcicola J. S. Kim: A new species of Clematis sect. Atragene (Ranunculaceae) from Korea. Korean Journal of Plant Taxonomy 39:1–3.
Kim JS. 2017a. Clematis L. Flora of Korea 2a In : National Institute of Biological Resources, ed. Incheon: p. 69–76.
Kim M. 2017b. Korean Endemic Plants. Haejin Media Co. Ltd. Seoul: p. 76.
Lee JK, Kim HS. 2018. Clematis calcicola J. S. Kim. Silvics of Korea 2. Oh BU, Oh SH. Korea National Arboretum of the Korea Forest Service. Pocheon: p. 63–71.
Maier RM, Neckermann K, Igloi GL, Kössel H. 1995;Complete sequence of the maize chloroplast genome: Gene content, hotspots of divergence and fine tuning of genetic information by transcript editing. Journal of Molecular Biology 251:614–628.
National Institute of Biological Resources. 2011. Endemic Species of Korea Geobook. Incheon: p. 331.
Palmer JD. 1985;Comparative organization of chloroplast genomes. Annual Review of Genetics 19:325–354.
Park BK, Ghimire B, Ha Y-H, Son DC, Kim D-K. 2021;Complete chloroplast genome of Clematis taeguensis (Ranunculaceae), an endemic species from South Korea. Mitochondrial DNA Part B Resources 6:1496–1497.
Park BK, Kim J-S, Chung GY, Kim J-H, Son DC, Jang C-G. 2022;Clematis pseudotubulosa (Ranunculaceae), a new species from Korea. Korean Journal of Plant Taxonomy 52:35–44.
Reith M, Munholland J. 1995;Complete nucleotide sequence of the Porphyra purpurea chloroplast genome. Plant Molecular Biology Reporter 13:333–335.
Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Höhna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. 2012;MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology 61:539–542.
Tamura M. 1955. Systema clematidis Asiae Orientalis. Science Reports 4. College of General Education, Osaka University. Osaka: p. 43–55.
Tamura M. 1967. Morphology, Ecology and Phylogeny of the Ranunculaceae VII. Science Reports 16. College of General Education, Osaka University. Osaka: p. 21–43.
Tamura M. 1987;A classification of genus Clematis. Acta Phytotaxonomica et Geobotanica 38:33–44.
Tamura M. 1995. Clematis. Die Natürlichen Pflanzenfamilien. Zweite Aufl 17a. IV In : Hiepko P, ed. Duncker & Humblot. Berlin: p. 368–387.
Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES, Fischer A, Bock R, Greiner S. 2017;GeSeq: Versatile and accurate annotation of organelle genomes. Nucleic Acids Research 45:W6–W11.
Trifinopoulos J, Nguyen L-T, von Haeseler A, Minh BQ. 2016;W-IQ-TREE: A fast online phylogenetic tool for maximum likelihood analysis. Nucleic Acids Research 44:W232–W235.
Wang R-J, Cheng C-L, Chang C-C, Wu C-L, Su T-M, Chaw S-M. 2008;Dynamics and evolution of the inverted repeat-large single copy junctions in the chloroplast genomes of monocots. BMC Evolutionary Biology 8:36.
Wang WT, Bartholomew B. 2001. Clematis L. Flora of China. 6. Caryophyllaceae through Lardizabalaceae In : Wu ZY, Raven PH, Hong DY, eds. Science Press, Beijing and Missouri Botanical Garden Press. St Louis, MO: p. 333–386.
Wang W-T, Li L-Q. 2005;A new system of classification of the genus Clematis (Ranunculaceae). Acta Phytotaxonomica Sinica 43:431–488.
Yang W-J, Li L-Q, Xie L. 2009;A revision of Clematis sect. Atragene (Ranunculaceae). Journal of Systematics and Evolution 47:552–580.
Zhang D, Gao F, Jakovlić I, Zou H, Zhang J, Li WX, Wang GT. 2020;PhyloSuite: An integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Molecular Ecology Resources 20:348–355.

Table 1.

List of genes annotated in the chloroplast genome of Clematis calcicola.

Category of genes | Group of genes | Names of genes
Self-replication | Large subunit ribosomal proteins | rpl2(×2)*, rpl14(×2), rpl16(×2)*, rpl20, rpl22(×2), rpl23(×2), rpl32, rpl33, rpl36
 | DNA-dependent RNA polymerase | rpoA, rpoB, rpoC1*, rpoC2
 | Small subunit ribosomal proteins | rps2, rps3(×2), rps4, rps7(×2), rps8(×2), rps11, rps12(×2)**T, rps14, rps15, rps16*, rps18, rps19(×2)
 | Ribosomal RNAs | rrn4.5S(×2), rrn5S(×2), rrn16S(×2), rrn23S(×2)
 | Transfer RNAs | trnA-UGC(×2)*, trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnfM-CAU, trnG-GCC, trnG-UCC*, trnH-GUG, trnI-GAU(×2)*, trnI-CAU(×2), trnK-UUU*, trnL-CAA(×2), trnL-UAA*, trnL-UAG, trnM-CAU, trnN-GUU(×2), trnP-UGG, trnQ-UUG, trnR-ACG(×2), trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnV-GAC(×2), trnV-UAC*, trnW-CCA, trnY-GUA
Photosynthesis | Subunits of ATP synthase | atpA, atpB, atpE, atpF*, atpH, atpI
 | Subunits of NADH-dehydrogenase | ndhA*, ndhB(×2)*, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
 | Subunits of cytochrome b/f complex | petA, petB*, petD*, petG, petL, petN
 | Subunits of photosystem I | psaA, psaB, psaC, psaI, psaJ
 | Subunits of photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbT, psbZ
 | Subunit of rubisco | rbcL
 | Photosystem assembly factors | pafI**, pafII
 | Photosystem biogenesis factor | pbf1
Other genes | Subunit of acetyl-CoA-carboxylase | accD
 | C-type cytochrome synthesis gene | ccsA
 | Envelope membrane protein | cemA
 | ATP-dependent protease subunit P | clpP**
 | Translational initiation factor | ψinfA(×2)
 | Maturase | matK
Unknown function | Conserved open reading frames | ycf1, ycf2(×2)
*, genes containing one intron; **, genes containing two introns; T, trans-spliced genes; (×2), genes with two copies; ψ, pseudogene.

Table 2.

SNPs in the chloroplast genome of Clematis calcicola relative to that of C. macropetala.

No. C. calcicola C. macropetala Site Region Gene Position
1 G A 2,332 LSC rps16-trnK IGS
2 A C 7,238 LSC psbA-trnH IGS
3 G A 7,479 LSC trnQ-UUG gene
4 T C 8,170 LSC psbK-psbI IGS
5 G A 40,398 LSC trnS-psbZ IGS
6 A C 70,138 LSC clpP1 intron intron
7 T A 79,713 IRB ψinfA-rps8 IGS
8 C A 110,507 SSC ndhF gene
9 T A 110,523 SSC ndhF gene
10 A T 115,484 SSC ccsA gene
11 T A 119,716 SSC ndhG-ndhI IGS
12 A T 119,774 SSC ndhG-ndhI IGS
13 T C 119,775 SSC ndhG-ndhI IGS
14 T C 119,786 SSC ndhG-ndhI IGS
15 C T 119,787 SSC ndhG-ndhI IGS
16 A T 159,421 IRA rps8-ψinfA IGS

SNP, single nucleotide polymorphism; LSC, large single-copy; IGS, intergenic spacer; IRB, inverted repeat B; SSC, small single-copy; IRA, inverted repeat A.
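To summarize where the SNPs in Table 2 fall, the following sketch tallies them by genome region and by position type. The rows are transcribed from Table 2; the tuple layout and variable names are illustrative choices, and the row for the clpP1 intron is labeled "intron".

```python
from collections import Counter

# (C. calcicola base, C. macropetala base, site, region, position type),
# transcribed from Table 2.
SNPS = [
    ("G", "A", 2332, "LSC", "IGS"),
    ("A", "C", 7238, "LSC", "IGS"),
    ("G", "A", 7479, "LSC", "gene"),
    ("T", "C", 8170, "LSC", "IGS"),
    ("G", "A", 40398, "LSC", "IGS"),
    ("A", "C", 70138, "LSC", "intron"),
    ("T", "A", 79713, "IRB", "IGS"),
    ("C", "A", 110507, "SSC", "gene"),
    ("T", "A", 110523, "SSC", "gene"),
    ("A", "T", 115484, "SSC", "gene"),
    ("T", "A", 119716, "SSC", "IGS"),
    ("A", "T", 119774, "SSC", "IGS"),
    ("T", "C", 119775, "SSC", "IGS"),
    ("T", "C", 119786, "SSC", "IGS"),
    ("C", "T", 119787, "SSC", "IGS"),
    ("A", "T", 159421, "IRA", "IGS"),
]

# Count SNPs per genome region (column 4) and per position type (column 5).
by_region = Counter(snp[3] for snp in SNPS)
by_position = Counter(snp[4] for snp in SNPS)

print(len(SNPS), "SNPs in total")
print(dict(by_region))
print(dict(by_position))
```

The tally shows that most of the 16 SNPs lie in the SSC and LSC regions and in intergenic spacers, with one SNP in each inverted repeat.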

Table 3.

Indels in the chloroplast genome of Clematis calcicola relative to that of C. macropetala.

No. C. calcicola C. macropetala Start End Length Region Position
1 - T 7,095 7,095 1 LSC psbA-trnH
2 - T 10,568 10,568 1 LSC trnF-GAA-ndhJ
3 ATAT - 12,739 12,742 4 LSC ndhC-trnG
4 A - 35,889 35,889 1 LSC trnE-GCA-trnT
5 T - 36,506 36,506 1 LSC trnT-GGU-psbD
6 A - 37,116 37,116 1 LSC trnT-GGU-psbD
7 T - 68,885 68,885 1 LSC rpl20-rps12
8 TTTATATT - 70,561 70,568 8 LSC clpP1 intron
9 - A 75,155 75,155 1 LSC petB intron
10 - A 77,669 77,669 1 LSC petD-rpoA
11 A - 110,499 110,499 1 SSC ndhF
12 T - 113,545 113,545 1 SSC ndhF-rpl32
13 - A 115,931 115,931 1 SSC ccsA
14 - TTTTTA 116,163 116,168 6 SSC ccsA-ndhD
15 - AACTATCTA 116,224 116,232 9 SSC ccsA-ndhD
16 T - 118,371 118,371 1 SSC psaC-ndhE
17 - T 118,831 118,831 1 SSC ndhE-ndhG
18 TCTA - 119,779 119,782 4 SSC ndhG-ndhI
19 - A 123,994 123,994 1 SSC ndhH-rps15
20 T - 124,452 124,452 1 SSC rps15-ycf1
21 AATGAC - 128,627 128,632 6 SSC ycf1

LSC, large single-copy; SSC, small single-copy.
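The indel rows in Table 3 can be tallied in the same way. The sketch below, with rows transcribed from Table 3, counts the bases present in only one of the two genomes, the net length difference, and the indel bases that fall in the four regions the text treats as protein-coding (clpP1, ndhF, ccsA, and ycf1). The helper `bases` and the set `CODING_POSITIONS` are hypothetical names introduced for illustration.

```python
# (bases in C. calcicola, bases in C. macropetala, region, position),
# transcribed from Table 3; "-" marks the genome that lacks the bases.
INDELS = [
    ("-", "T", "LSC", "psbA-trnH"),
    ("-", "T", "LSC", "trnF-GAA-ndhJ"),
    ("ATAT", "-", "LSC", "ndhC-trnG"),
    ("A", "-", "LSC", "trnE-GCA-trnT"),
    ("T", "-", "LSC", "trnT-GGU-psbD"),
    ("A", "-", "LSC", "trnT-GGU-psbD"),
    ("T", "-", "LSC", "rpl20-rps12"),
    ("TTTATATT", "-", "LSC", "clpP1 intron"),
    ("-", "A", "LSC", "petB intron"),
    ("-", "A", "LSC", "petD-rpoA"),
    ("A", "-", "SSC", "ndhF"),
    ("T", "-", "SSC", "ndhF-rpl32"),
    ("-", "A", "SSC", "ccsA"),
    ("-", "TTTTTA", "SSC", "ccsA-ndhD"),
    ("-", "AACTATCTA", "SSC", "ccsA-ndhD"),
    ("T", "-", "SSC", "psaC-ndhE"),
    ("-", "T", "SSC", "ndhE-ndhG"),
    ("TCTA", "-", "SSC", "ndhG-ndhI"),
    ("-", "A", "SSC", "ndhH-rps15"),
    ("T", "-", "SSC", "rps15-ycf1"),
    ("AATGAC", "-", "SSC", "ycf1"),
]

# Positions the text groups under protein-coding regions (assumed mapping).
CODING_POSITIONS = {"clpP1 intron", "ndhF", "ccsA", "ycf1"}


def bases(seq: str) -> int:
    """Number of bases in a Table 3 cell; a gap ('-') counts as zero."""
    return 0 if seq == "-" else len(seq)


calcicola_only = sum(bases(row[0]) for row in INDELS)
macropetala_only = sum(bases(row[1]) for row in INDELS)
total_positions = calcicola_only + macropetala_only
net_difference = calcicola_only - macropetala_only
coding_positions = sum(
    bases(row[0]) + bases(row[1]) for row in INDELS if row[3] in CODING_POSITIONS
)

print(len(INDELS), "indel events,", total_positions, "indel positions")
print("net length difference (C. calcicola - C. macropetala):", net_difference, "bp")
print("positions in coding regions:", coding_positions)
```

The 21 events add up to 52 indel positions, 16 of them in the four regions named above, and the net difference of 8 bp equals the difference between the two reported genome lengths (159,655 bp and 159,647 bp).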