Complete chloroplast genome sequence of Clematis calcicola (Ranunculaceae), a species endemic to Korea
Abstract
The complete chloroplast genome (cp genome) sequence of Clematis calcicola J. S. Kim (Ranunculaceae) is 159,655 bp in length. It consists of large (79,451 bp) and small (18,126 bp) single-copy regions and a pair of identical inverted repeats (31,039 bp each). The genome contains 92 protein-coding genes, 36 transfer RNA genes, eight ribosomal RNA genes, and two pseudogenes. A phylogenetic analysis based on the cp genome of 19 taxa showed high similarity between our cp genome and data published for C. calcicola, which is recognized as a species endemic to the Korean Peninsula. The complete cp genome sequence of C. calcicola reported here provides important information for future phylogenetic and evolutionary studies of Ranunculaceae.
INTRODUCTION
The genus Clematis L. (Ranunculaceae) contains ca. 300 species of herbaceous or woody vines and a few erect shrubs and perennial herbs. They are common in temperate regions, including northern Europe, Siberia, and the Far East, but some species also occur in tropical regions (Tamura, 1955, 1967, 1987, 1995; Wang and Bartholomew, 2001; Wang and Li, 2005; Kadota, 2006; Chang, 2007; Kim, 2017a; Chang and Kim, 2018; Park et al., 2022). Clematis calcicola J. S. Kim is endemic to the Korean Peninsula (Kim et al., 2009; National Institute of Biological Resources, 2011; Chung et al., 2017; Kim, 2017a, 2017b; Lee and Kim, 2018). This species is easily distinguishable from closely related taxa by its sparsely dentate, glabrous, and subcoriaceous leaflets and its thick, smooth sepals (Kim et al., 2009; Kim, 2017a). However, C. calcicola has a limited distribution in Korea. In the present study, the complete chloroplast genome of C. calcicola is reported, to the best of our knowledge, for the first time. A phylogenetic analysis was also carried out to investigate the relationships of C. calcicola with other Clematis species.
Chloroplasts are metabolic organelles with an important role in the physiology and development of terrestrial plants and algae (Gray, 1989; Howe et al., 2003). The chloroplast genome is a suitable tool for studying the evolution and phylogeny of plants because of its highly conserved sequence and structure (Maier et al., 1995). Chloroplasts have their own genetic replication mechanisms and transcribe their genome relatively independently (Fu et al., 2016). Most chloroplast genomes range from 120 to 200 kb in length and exhibit a typical quadripartite structure, comprising a large single-copy region (LSC, 80–90 kb), a small single-copy region (SSC, 16–27 kb), and two inverted repeats (IRs, 20–28 kb) of subequal length (Palmer, 1985; Wang et al., 2008). A complete chloroplast genome sequence provides a large amount of information, including not only information on protein-coding and non-coding genes but also data for inferring gene rearrangements and evolutionary relationships (Golenberg et al., 1993; Reith and Munholland, 1995).
The complete chloroplast genome of C. calcicola and the phylogenetic analysis results reported in the present study provide information on the relationship between C. calcicola and related species, which may be a valuable resource for future studies on this species.
MATERIALS AND METHODS
Clematis calcicola was sampled from Mt. Deokhang, Samcheok-si, Gangwon-do, South Korea (37°18′49.0″N, 129°00′41.5″E). The collected material was deposited in the Herbarium of the Korea National Arboretum (KH; contact, Dong Chan Son) under voucher Deokhangsan-190529-001.
Fresh leaves were silica-dried. Leaf DNA was extracted using a DNeasy Plant Mini Kit (Qiagen, Seoul, Korea) and verified by 2% agarose gel electrophoresis. The DNA library was constructed using the TruSeq Nano DNA Kit following the Sample Preparation Guide protocol provided by the manufacturer (Macrogen Inc., Seoul, Korea). Paired-end genome sequencing was performed at Macrogen Inc. on an Illumina platform (Illumina Inc., San Diego, CA, USA) with a read length of 301 bp.
The complete chloroplast genome was assembled using Geneious v9.0.5 (Biomatters, Auckland, New Zealand) and annotated using the GeSeq tool (Tillich et al., 2017) and Geneious v9.0.5 (Biomatters).
To infer the phylogenetic relationships among Clematis species, the complete chloroplast genome sequences of 20 Clematis species were downloaded from GenBank (https://www.ncbi.nlm.nih.gov/genbank/). Anemoclema glaucifolium (Franch.) W. T. Wang was used as the outgroup. Phylogenetic analysis was performed using 78 coding sequences of the Clematis species. Alignments were performed using MAFFT v7.450 (Katoh et al., 2002; Katoh and Standley, 2013). A maximum likelihood (ML) analysis with 1,000 bootstrap replicates was performed, and the best-fit model (transversion model [TVM] + empirical base frequencies [F] + FreeRate model with two rate categories [R2]) was determined using the IQ-TREE web server (Trifinopoulos et al., 2016). Bayesian inference (BI) analysis was conducted using MrBayes v3.2.6 (Ronquist et al., 2012) in PhyloSuite (Zhang et al., 2020). The best-fit model of molecular evolution for the chloroplast genome dataset (general time reversible [GTR] + proportion of invariable sites [I] + gamma-distributed rate variation across sites [G] + F) was selected using ModelFinder v2.0 (Kalyaanamoorthy et al., 2017).
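The text does not state how the 78 coding-sequence alignments were combined before tree inference. One common approach, assumed here purely for illustration, is to concatenate the per-gene alignments into a single supermatrix and record the gene boundaries as partitions. The following is a minimal sketch of that step; the function name `concatenate_alignments` and the toy sequences are hypothetical, and only the gene names (ndhF, ccsA) are taken from this report.

```python
def concatenate_alignments(gene_alignments, missing="-"):
    """Join per-gene alignments into one supermatrix.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Taxa absent from a gene are padded with the `missing` character.
    Returns (supermatrix, partitions), where partitions lists
    (gene, start, end) in 1-based inclusive coordinates.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for t in taxa:
            pieces[t].append(aln.get(t, missing * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {t: "".join(parts) for t, parts in pieces.items()}
    return supermatrix, partitions


# Toy input (hypothetical sequences, not data from this study).
toy = {
    "ndhF": {"taxon_A": "ATG", "taxon_B": "ATA"},
    "ccsA": {"taxon_A": "CC--", "taxon_C": "CCGT"},
}
matrix, parts = concatenate_alignments(toy)
```

The partition list can be passed on to downstream software when a separate substitution model per gene is wanted; whether the original analysis was partitioned is not stated in the text.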
RESULTS AND DISCUSSION
Sequencing of Clematis calcicola produced 6,657,552 reads, 7,417 of which corresponded to the chloroplast genome (depth 5.4). The complete chloroplast genome sequence of C. calcicola was 159,655 bp in length (Fig. 1) and was deposited in GenBank under accession number OK181902, with associated BioProject, SRA, and BioSample numbers PRJNA886061, SRR21776408, and SAMN31117752, respectively. The LSC, SSC, and each of the IRs were 79,451 bp, 18,126 bp, and 31,039 bp in length, respectively. The overall GC content of the chloroplast genome was 38%, and that of the LSC, SSC, and each of the IRs was 36.3%, 31.3%, and 42%, respectively. The chloroplast genome contained 136 unique genes, including 92 coding genes, eight ribosomal RNA (rRNA) genes, and 36 transfer RNA (tRNA) genes. Twenty-six genes (15 coding genes, four tRNA, and seven rRNA) were duplicated in the IRs. In addition, infA, identified as a pseudogene, was located in the IRs (Table 1).
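The reported figures can be cross-checked with simple arithmetic: the quadripartite genome length is LSC + SSC + 2 × IR, the overall GC content should be close to the length-weighted mean of the regional GC values, and the gene categories should sum to the stated total. The sketch below uses only numbers given in the text; the function names are illustrative, not part of any published tool.

```python
# Region lengths (bp) and GC contents (%) as reported in the text.
LSC, SSC, IR = 79_451, 18_126, 31_039
REPORTED_TOTAL = 159_655


def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome with one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


def weighted_gc(regions):
    """Length-weighted mean GC (%) over (length_bp, gc_percent) pairs."""
    total = sum(length for length, _ in regions)
    return sum(length * gc for length, gc in regions) / total


regions = [(LSC, 36.3), (SSC, 31.3), (IR, 42.0), (IR, 42.0)]
genome_length = quadripartite_length(LSC, SSC, IR)
overall_gc = weighted_gc(regions)
gene_total = 92 + 8 + 36  # coding + rRNA + tRNA genes

print(genome_length, round(overall_gc, 1), gene_total)
```

With these inputs the region lengths sum to 159,655 bp, the weighted GC content is about 37.9% (reported as 38%), and the gene categories sum to 136, all in agreement with the values stated above.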
We used sequences from 21 species (20 species of Clematis and one species of Anemoclema as an outgroup) to infer ML-based and BI-based phylogenies. Both phylogenies (ML and BI) had the same topology and showed high support for each branch (Fig. 2). The phylogenetic analysis recovered the genus Clematis as monophyletic. Nevertheless, the topology did not match the classification system divided into sections (Wang and Li, 2005; Kim et al., 2009). However, C. calcicola and C. macropetala Ledeb., both belonging to section Atragene, formed a well-supported group (bootstrap value [BS] = 100; posterior probability [PP] = 1) that was sister to section Cheiropsis (BS = 98; PP = 1) and branched first within the genus Clematis, forming a basal group. We performed a comparative analysis with the chloroplast genome of C. macropetala, which was identified as the species most closely related to C. calcicola in the phylogenetic analysis (Fig. 2). The chloroplast genome of C. calcicola was 11 bp longer than that of C. macropetala (159,647 bp, NC_041477). Nevertheless, the pairwise identity between the C. calcicola and C. macropetala chloroplast genomes was 99.96%.
Sixteen single nucleotide polymorphisms and 52 insertions/deletions (indels) were identified in the pairwise alignment of the two Clematis chloroplast genomes (Tables 2 and 3). Of the 52 indels, 16 were found in four protein-coding regions (clpP1, ndhF, ccsA, and ycf1) and 36 in non-coding regions (Table 3).
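As an illustration of how such counts can be derived from a pairwise alignment, the sketch below tallies substitutions, indel events (each contiguous run of gaps in one sequence counted once), and percent identity. It is a simplified stand-in under stated assumptions, not the procedure used in this study: the function name `compare_aligned` is hypothetical, identity is computed over all alignment columns including gapped ones, and other software may define these quantities differently.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Summarise two aligned sequences of equal length.

    Returns counts of substitutions (SNPs) and indel events, plus
    percent identity over alignment columns. Columns where both
    sequences carry a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    snps = matches = indel_events = columns = 0
    prev_state = None  # which sequence carried a gap in the previous column
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == gap:
            state = "a"
        elif b == gap:
            state = "b"
        else:
            state = None
            if a == b:
                matches += 1
            else:
                snps += 1
        # A new indel event starts whenever a gap run begins or switches sequence.
        if state is not None and state != prev_state:
            indel_events += 1
        prev_state = state
    identity = 100.0 * matches / columns if columns else 0.0
    return {"snps": snps, "indel_events": indel_events, "identity_pct": identity}


# Toy alignment (hypothetical sequences, not data from this study).
result = compare_aligned("ATGC--GTACGT", "ATGCAAGTTCG-")
```

In the toy alignment, the two-column gap counts as a single indel event, which is the usual reason indel counts are much smaller than the number of gapped columns.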
Clematis calcicola can be morphologically distinguished from C. macropetala by its terete branches, ternate leaves, slightly dentate to dentate and subcoriaceous leaflet margins, and linear-spatulate to spatulate staminodes (Kim et al., 2009; Yang et al., 2009). Although the two species are morphologically similar, the genome differences and phylogenetic analysis in the present study support the view that C. calcicola is closely related to, but distinct from, C. macropetala. In addition, C. macropetala is distributed in central to northeastern China, eastern Mongolia, and southeastern Russia, whereas C. calcicola is restricted to Korea.
Recently, three species of Korean Clematis have been reported (Choi et al., 2021; Park et al., 2021). By adding one more species, this study provides important basic information for future phylogenetic and evolutionary studies of Clematis and Ranunculaceae. In addition, comparative data for the two species within section Atragene of the genus Clematis will be useful for developing species identification markers and conducting genetic diversity analyses.
Acknowledgements
This study was supported by the grant ‘Silvics of Korea’ [KNA-1-1-18, 15-3] financed by the Korea National Arboretum.
Notes
CONFLICTS OF INTEREST
The authors declare that there are no conflicts of interest.