Complete chloroplast genomes of Lespedeza inschanica and L. juncea (Fabaceae: section Junceae)
Abstract
Species of the section Junceae (Lespedeza, Fabaceae) have been used in traditional medicine, and their extracts have been investigated as medicinal resources in Korea. However, their morphological similarities may lead to misidentification and misuse. In this study, we analyzed the chloroplast genomes of Lespedeza inschanica and Lespedeza juncea to obtain additional evidence for the classification of this section. Both genomes are approximately 149 kb long, comprising large single-copy, small single-copy, and inverted repeat regions, with a total of 128 genes. Lespedeza chloroplast genomes share several features, including the loss of the infA and rpl22 genes, the loss of the rpl2 and rps12 introns, and an inversion of the trnD-Y-E cluster. A phylogenetic analysis supported Junceae as a monophyletic section comprising two clades: one included Lespedeza cuneata, L. inschanica, and L. juncea, whereas the other included Lespedeza davurica and Lespedeza floribunda. This study provides foundational data for the taxonomy of Lespedeza.
INTRODUCTION
The genus Lespedeza Michx., belonging to the family Fabaceae, comprises approximately 40 species and is distributed throughout Asia and eastern North America (Ohashi and Nemoto, 2014). A total of 16 species of Lespedeza have been identified in Korea (Choi, 2007; Han and Choi, 2007, 2008; Chung et al., 2023). The genus is divided into two subgenera, Lespedeza and Macrolespedeza (Maxim.) H. Ohashi, on the basis of seedling morphology and molecular analyses (Ohashi and Nemoto, 2014). Of these, subgenus Macrolespedeza is restricted to certain regions of Asia and is divided into two sections: Junceae (Maxim.) H. Ohashi & T. Nemoto, and Macrolespedeza Maxim. (Ohashi and Nemoto, 2014).
Section Junceae is characterized by herbaceous or subshrubby plants that bear both chasmogamous and cleistogamous flowers (Ohashi and Nemoto, 2014). Some species of sect. Junceae, such as Lespedeza cuneata (Dum. Cours.) G. Don and Lespedeza pilosa (Thunb.) Siebold & Zucc., have been used as traditional medicinal herbs to invigorate the stomach, liver, and kidneys. They have also been studied as medicinal resources for reducing various inflammatory diseases (Huang et al., 2010; Kim and Kim, 2010; Kim et al., 2011). In particular, L. cuneata has been reported to have high flavonoid and polyphenol content and excellent antioxidant activity (Kim et al., 2012). However, L. cuneata is morphologically similar to L. juncea (L. f.) Pers. and L. inschanica (Maxim.) Schindl., which has led to differing views on species identification. Lespedeza inschanica has been treated either as a variety of L. juncea (Lee, 1965) or as a hybrid between L. juncea and L. davurica (Laxm.) Schindl. (Choi, 2007). Such morphological similarity among Lespedeza species could also hinder their accurate use as biological resources. In this study, we investigated the chloroplast genomes of L. inschanica and L. juncea to help determine species boundaries within sect. Junceae.
MATERIALS AND METHODS
We collected fresh leaves of L. inschanica and L. juncea from the Sinduri beach, Taean-gun, Korea (36.85190°N, 126.19904°E). Voucher specimens (collection numbers: L. inschanica, 239010; L. juncea, 239002) were deposited in the herbarium of the National Institute of Biological Resources (KB). DNA was extracted using an Axen Plant DNA Mini Kit (Macrogen, Seoul, Korea). Genomic DNA was sequenced using an Illumina NovaSeq X Platform (Macrogen). In total, 36,787,194 (L. juncea) and 37,748,849 (L. inschanica) paired-end reads (150 bp each) were obtained. These reads were mapped onto the chloroplast genome of L. cuneata (GenBank accession number: NC057455) using NOVOPlasty 4.3.5 (Dierckxsens et al., 2017). To confirm the sequences, we reassembled the reads of L. inschanica and L. juncea using the map-to-reference application in Geneious Prime 2024.0.5 (Biomatters Ltd., Auckland, NZ). Low-quality reads were removed using Trimmomatic 0.32 (Bolger et al., 2014). Genes in the confirmed sequences were annotated where the nucleotide sequences of the plastid genes showed >90% similarity to those of the reference genome. Some protein-coding genes were identified manually by considering their start and stop codons. Transfer RNAs (tRNAs) were confirmed using tRNAscan-SE (Lowe and Chan, 2016) and compared with the tRNAs of other species. Circular maps of the L. inschanica and L. juncea plastomes were generated using OGDRAW v1.31 (Greiner et al., 2019). The final genome sequences were deposited in GenBank (https://www.ncbi.nlm.nih.gov/) (L. inschanica, PQ652319; L. juncea, PQ652320).

Nine additional genome sequences, i.e., L. bicolor Turcz. (NC046836), L. buergeri Miq. (NC061375), L. cuneata (NC057455), L. davurica (NC042748), L. floribunda Bunge (MH800327), L. maritima Nakai (MG867570), L. tricolor (Nakai) D. P. Jin & J. W. Park & B. H. Choi (NC064210), Kummerowia striata (Thunb.) Schindl. (MG867569), and Campylotropis macrocarpa (Bunge) Rehder (MG867566), were obtained from GenBank to analyze the phylogenetic relationships of Lespedeza species. Kummerowia striata and C. macrocarpa were used as outgroups. For each taxon, 67 conserved protein-coding genes were selected, and their sequences were aligned using the default parameters of Geneious Alignment. For tree construction by maximum likelihood (ML) analysis, we selected a nucleotide substitution model using jModelTest 2.1.6 (Darriba et al., 2012). In this process, 88 substitution models were compared, incorporating a gamma distribution for among-site rate heterogeneity, and the TVM + G model was selected based on the Akaike information criterion. The ML tree was constructed using IQ-TREE 1.6.12 (Trifinopoulos et al., 2016) with 1,000 bootstrap replicates.

Polymorphic simple sequence repeats (SSRs) identified in chloroplast genomes have been extensively investigated owing to their frequent use in species identification, population genetics, and phylogenetic studies. The MISA microsatellite finder was used to detect SSRs (Beier et al., 2017). The minimum numbers of repeats were set to ten for mononucleotides, five for dinucleotides, four for trinucleotides, and three for tetra-, penta-, and hexanucleotides. The chloroplast genome of L. cuneata (GenBank accession number: NC057455) was included for SSR comparison.
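To make the SSR thresholds concrete, the following Python sketch scans a sequence for perfect repeats using the minimum repeat numbers stated above. It is a simplified illustration and not the MISA implementation: the function names `find_ssrs` and `is_primitive` are hypothetical, compound and interrupted SSRs are ignored, and motifs that are themselves repeats of a shorter unit (e.g., ATAT) are skipped, which is an assumption about how redundant motifs should be handled.

```python
# Minimum number of repeats per motif length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """Return True if the motif is not a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Simplified sketch: only perfect repeats are reported, and positions are
    0-based. This is not the MISA algorithm itself.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        i = 0
        while i + unit_len <= len(seq):
            motif = seq[i:i + unit_len]
            if set(motif) <= set("ACGT") and is_primitive(motif):
                # Count how many consecutive copies of the motif start at i.
                reps = 1
                while seq.startswith(motif, i + reps * unit_len):
                    reps += 1
                if reps >= min_rep:
                    hits.append((i, motif, reps))
                    i += reps * unit_len  # jump past the reported repeat
                    continue
            i += 1
    return sorted(hits)


# Toy sequence (not real plastome data) with one mono-, di-, and trinucleotide SSR.
demo = "G" + "A" * 12 + "C" + "AT" * 5 + "G" + "AAT" * 4 + "C"
print(find_ssrs(demo))
# [(1, 'A', 12), (14, 'AT', 5), (25, 'AAT', 4)]
```

Counts from such a scan will not necessarily match the MISA output, which also reports compound SSRs; the sketch only shows how the repeat thresholds translate into a search.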
RESULTS AND DISCUSSION
The chloroplast genomes of L. inschanica and L. juncea were 149,015 and 148,975 bp long, respectively, and comprised a large single-copy (LSC) region (L. inschanica, 82,421 bp; L. juncea, 82,383 bp), a small single-copy (SSC) region (L. inschanica, 18,932 bp; L. juncea, 18,930 bp), and two inverted repeats (IRs) (23,831 bp each), with an overall GC content of 35.0% (Fig. 1, Table 1). Each genome harbored 128 genes, including 83 protein-coding genes, 8 rRNA genes, and 37 tRNA genes (Tables 1, 2). The chloroplast genome lengths and numbers of genes were similar to those of other Lespedeza species (Jin et al., 2019a; Somaratne et al., 2019; Kim et al., 2022; Wang et al., 2022). Two genes typical of Fabaceae, i.e., infA and rpl22, were lost, and the introns of rpl2 and rps12 were absent, as in other species of the tribe Desmodieae. An inversion of the gene cluster trnD-GUC–trnY-GUA–trnE-UUC (trnD-Y-E), previously identified in the tribe Desmodieae (Jin et al., 2019a), was also confirmed in the present study. Within the tribe, the subtribe Lespedezinae (including the genera Campylotropis Bunge, Kummerowia Schindl., and Lespedeza) showed a 500-bp deletion adjacent to trnD-Y-E (Jin et al., 2019a). However, no significant differences were observed between the Lespedeza sections in this region. Five protein-coding genes (ndhB, rpl2, rpl23, rps12, and ycf2), seven tRNA genes (trnA-UGC, trnI-CAU, trnI-GAU, trnL-CAA, trnN-GUU, trnR-ACG, and trnV-GAC), and four rRNA genes (4.5S, 5S, 16S, and 23S rRNA) were duplicated in the IRs.
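As a quick consistency check on the reported quadripartite structure, the total genome length should equal LSC + SSC + 2 × IR, and the gene categories should sum to the reported gene total. The short Python sketch below performs this arithmetic using only the values given in the text; the dictionary layout and the function name `region_sum` are illustrative choices.

```python
# Region lengths (bp) as reported in the text.
GENOMES = {
    "L. inschanica": {"total": 149_015, "LSC": 82_421, "SSC": 18_932, "IR": 23_831},
    "L. juncea": {"total": 148_975, "LSC": 82_383, "SSC": 18_930, "IR": 23_831},
}

# Gene categories reported for each genome.
GENE_COUNTS = {"protein_coding": 83, "rRNA": 8, "tRNA": 37}


def region_sum(genome):
    """Sum the LSC, the SSC, and both copies of the inverted repeat."""
    return genome["LSC"] + genome["SSC"] + 2 * genome["IR"]


for name, genome in GENOMES.items():
    print(name, region_sum(genome), region_sum(genome) == genome["total"])
# L. inschanica 149015 True
# L. juncea 148975 True

print(sum(GENE_COUNTS.values()))
# 128
```

Both region sums match the reported totals, and the three gene categories add up to the 128 genes stated for each genome.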

Fig. 1. Chloroplast genome maps of Lespedeza inschanica and Lespedeza juncea. Genes outside the outer circle are transcribed clockwise, whereas those inside the outer circle are transcribed counterclockwise. Colored rectangles indicate functional genes, with categories shown at the bottom left. The gray scale in the inner circle indicates the GC content of the plastid genome.
Analysis of SSRs within the chloroplast genomes revealed 73, 71, and 77 SSRs in L. inschanica, L. juncea, and L. cuneata, respectively (Fig. 2). The mononucleotide A/T repeat motif was the most abundant (Fig. 2), occurring 35, 33, and 41 times in L. inschanica, L. juncea, and L. cuneata, respectively. The dinucleotide AT/AT repeat motif was the second most abundant (Fig. 2).

Fig. 2. Simple sequence repeats (SSRs) in the chloroplast genomes of three Lespedeza species. The Lespedeza cuneata genome was analyzed in a previous study (Somaratne et al., 2019). A. Repeat motif types of SSRs in the chloroplast genomes. B. Characterization of SSRs in the chloroplast genomes.
The section Junceae formed a monophyletic group in the ML tree, while the section Macrolespedeza was supported as paraphyletic (Fig. 3). Within the section Junceae, L. cuneata, L. inschanica, and L. juncea formed a clade (Clade 1) that was sister to a second clade (Clade 2) comprising L. davurica and L. floribunda (Fig. 3). These results are consistent with the morphological classification based on the inflorescences of chasmogamous flowers (Choi, 2007). Species in Clade 1 usually have inflorescences shorter than the leaves (sessile or nearly so), whereas species in Clade 2 usually have inflorescences longer than the leaves (pedunculate). However, our chloroplast genome data differ from those of previous molecular phylogenetic studies based on nuclear ribosomal internal transcribed spacers (ITS) (Han et al., 2010; Xu et al., 2017; Jin et al., 2019b), which showed that L. inschanica and L. juncea were more closely related to L. davurica than to L. cuneata. The ITS results were in accordance with the morphological classification of Huang et al. (2010). Although Huang et al. (2010) also recognized the short inflorescences of L. cuneata, they concluded that the inflorescences of L. inschanica and L. juncea were almost equal in length to the leaves. The scope of our data was, however, too limited to identify the precise phylogenetic relationships, as the present study did not include all species of the section Junceae. Thus, our chloroplast genome data should be used in further studies to confirm this hypothesis.

Fig. 3. A maximum likelihood (ML) tree of the genus Lespedeza, based on 63 coding genes of chloroplast genomes. The number after each scientific name is the GenBank accession number of the species. Bootstrap values are shown on the nodes. The scale bar at the bottom indicates substitutions per site. Newly sequenced individuals are marked with an asterisk.
Acknowledgements
This study was supported by a grant from the National Institute of Biological Resources (NIBR), Ministry of Environment (MOE) of the Republic of Korea (NIBR202323101, NIBR202413104).
Notes
CONFLICTS OF INTEREST
The authors declare that there are no conflicts of interest.