Accurate Identification of Abaca (Musa textilis Née) Cultivars Using Single Nucleotide Polymorphisms (SNP) Markers Developed for Banana (Musa acuminata Colla)
- 1. Philippine Fiber Industry Development Authority, Diliman, Philippines
- 2. USDA-ARS, Sustainable Perennial Crops Laboratory, USA
- 3. USDA-ARS, Tropical Agriculture Research Station, Mayaguez
Abstract
Abaca (Musa textilis Née) is a diploid Musa species native to the Philippines that is used to produce abaca fibers. The Philippines supplies 85 percent of the world supply of raw fiber, fiber craft, cordage and pulp, which provides livelihood opportunities for 1.5 million Filipinos. Cutting-edge molecular markers are needed to support germplasm management and crop improvement of this understudied crop. The objective of this study was to adapt a set of SNP markers from banana and validate their efficacy for abaca genotype identification. Using a nanofluidic genotyping platform, we evaluated 384 putative Single Nucleotide Polymorphism (SNP) markers developed for diploid banana (Musa acuminata Colla) on 62 abaca germplasm accessions. The cross-species transfer of nuclear SNP markers showed a 15.6% success rate, resulting in the selection of 60 polymorphic SNPs. The generated SNP profiles enabled accurate identification of all tested abaca cultivars and detection of homonymous naming mistakes. Cultivars with an inter-specific hybrid background (e.g. M. textilis x M. balbisiana) were differentiated using multivariate analysis and Bayesian stratification. These selected SNP markers will be highly useful for downstream applications in the abaca industry, including cultivar identification, nursery accreditation, authentication of abaca products and protection of intellectual property rights.
Citation
Galvez LC, Meinhardt LW, Goenaga R, Zhang D (2021) Accurate Identification of Abaca (Musa textilis Née) Cultivars Using Single Nucleotide Polymorphisms (SNP) Markers Developed for Banana (Musa acuminata Colla). Int J Plant Biol Res 9(1): 1125.
Keywords
• Manila hemp
• Molecular markers
• Natural fiber
• Industrial crop
• The Philippines
ABBREVIATIONS
SNP: Single Nucleotide Polymorphism; DNA: Deoxyribonucleic Acid; RNase A: Ribonuclease I or Ribonucleate 3′-pyrimidinooligonucleotidohydrolase; PCR: Polymerase Chain Reaction; IFC: Integrated Fluidic Circuit; PCoA: Principal Coordinates Analysis; PIC: Polymorphism Information Content; PCO: Principal Coordinates; STA: Specific Target Amplification
INTRODUCTION
Abaca (Musa textilis Née) is a monocotyledonous plant that is closely related to banana [1,2]. It is a member of the Order Zingiberales, Family Musaceae, under the section Callimusa/Australimusa and has a diploid chromosome number of 20 (2n=2x=20) [3-6]. The abaca plant is the source of the world’s strongest natural fiber, internationally known as Manila hemp [1,7].
The abaca fiber market is expected to grow considerably due to increasing global demand from pulp manufacturing and from abaca fiber industries [7,8]. Abaca is indigenous to the Philippine archipelago and was grown in the country even before Spanish colonization. Cultivation, fiber extraction and weaving into cloth were widespread in the islands, and abaca has been introduced into Sumatra, British Borneo, Malaya [9], Central America [10], and New Caledonia and Queensland [9]. After World War II, abaca was introduced into and cultivated in Ecuador and other tropical American countries [11].
Since the onset of the 20th century, abaca fiber has been the premier export commodity of the Philippines, and the abaca industry makes up a substantial part of the national GDP (US$131 M in export earnings in 2016) [12]. Abaca production in the Philippines accounts for 87% of the world supply of raw fiber, fiber craft, cordage and pulp [12]. Moreover, the abaca industry is a major source of livelihood for nearly 1.5M Filipinos, including 124,063 abaca farmers cultivating a total area of 141,614 hectares [12].
Given current environmental concerns that favor biodegradable products and forest conservation, abaca is a superior natural material with expanding industrial potential [12]. Abaca is now a preferred material in the production of pulp for specialty papers such as tea bags, meat/sausage casings, cigarette paper, filter papers, currency notes and stencil paper, and for non-woven product applications. Local and international companies are continuously developing new specialized products [7,8]. To cope with the growing demand for high-quality fiber in the international market, abaca cultivars with favorable agronomic traits and quality attributes are needed. Systematic characterization of abaca genetic resources is a prerequisite for effective selection and use of abaca germplasm [8].
Abaca is taxonomically complex because of the hybridization and polyploidization that have occurred naturally among abaca (i.e. cv Lausmag) and Musa species [13], including Musa balbisiana and Musa acuminata [1,2]. There are as many as 200 cultivars of abaca in the Philippines, mostly landraces, which are attributed to the planting of seeds in the early days of its domestication [9]. Duplications are possible because the same cultivar may be given a different name in different regions. There are more than 700 accessions of abaca maintained in field gene banks in the country [14], and there are still abaca plants in the wild [15]. To take full advantage of the rich genetic diversity in the abaca germplasm collections, a high standard of accuracy is essential for gene bank management. Each accession must have a true-to-type genetic identity, be accurately labeled and have intact database records.
As part of a PhilFIDA project, abaca germplasm is being characterized phenotypically. However, the phenotypic characteristics appear to be influenced by the environmental conditions and geographic location in which the plants are grown. Evolutionary processes driven by environmental factors that are influenced by geographical and physical differences have caused changes in some morphological characteristics while the genotype remains unchanged [16]. Moreover, environmental effects on phenotypic traits can be confounded with somaclonal mutations, which have been commonly reported in vegetatively propagated crops, including other Musa species. Consequently, both molecular and phenotypic characterizations are needed to accurately identify abaca genetic resources in both genebanks and farmers’ fields [8].
Molecular markers have been proposed for the identification of mislabeling, parentage and sibship analysis for quality control in breeding and seed programs, and characterization of farmer selections of new varieties for abaca production [8]. However, published research on molecular characterization of abaca germplasm is limited. Boguero et al. [17] used six SSR markers to genotype 57 abaca accessions and could identify accessions resistant and susceptible to bunchy top virus, thus establishing a genetic pool of germplasm for breeding resistance to bunchy top virus.
Single nucleotide polymorphisms (SNPs) are a highly abundant class of DNA sequence polymorphisms found in plant genomes [18]. SNP markers have become the preferred method for accurate genotype identification in tropical crops, as recently demonstrated in tea [19], pummelo [20], longan [21], pineapple [22] and coffee [23]. The development of draft genomes of several Musa species, including the diploid banana [24], paved the way for the development of putative SNP markers using next generation sequencing [26,27]. The online database Banana Genome Hub (http://banana-genome-hub.southgreen.fr/) provides potential tools for other related Musa species, including abaca.
The objective of this study was to adapt a set of SNP markers from banana and validate the efficiency of the SNP markers for abaca genotype identification. We conducted a pilot study to evaluate a set of banana SNP markers for abaca genotyping using a nanofluidic array. The cross-species transfer of nuclear SNP markers, as well as the genotyping method, will be useful for cultivar protection under intellectual property rights, germplasm management, and genetic improvement of abaca.
MATERIALS AND METHODS
Abaca leaf samples collection and DNA extraction
A total of 62 abaca accessions, most of which were landraces and farmer selections, were used in this pilot study (Table 1). These abaca accessions were collected from the abaca germplasm repository maintained by the Philippine Fiber Industry Development Authority at Diliman, Quezon City. Young and healthy leaf samples were harvested and dried in silica gel, and DNA was extracted from the dried abaca leaves with the DNeasy Plant Mini kit (Qiagen Inc., Valencia, CA, USA). The dry leaf tissue was placed in a 2-mL microcentrifuge tube with one ¼-inch ceramic sphere and 0.15 g garnet matrix (Lysing Matrix A; MP Biomedicals, Solon, OH, USA). The leaf samples were disrupted by high-speed shaking in a TissueLyser II (Qiagen Inc.) at 30 Hz for 1 min. Lysis solution, along with RNase A, was added to the powdered leaf samples and the mixture was incubated at 65 °C, as specified in the kit instructions. The remainder of the extraction followed the manufacturer’s instructions. DNA was eluted from the silica column with two washes of 50 µL Buffer AE, which were pooled, resulting in 100 µL of DNA solution.
DNA concentrations were determined by measuring absorbance at 260 nm, using a NanoDrop spectrophotometer (Thermo Scientific™, Wilmington, DE, USA). DNA purity was estimated from the 260/280 and 260/230 absorbance ratios.
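The absorbance-based quantification described above can be illustrated with a short sketch. It is not the instrument's software; it assumes the conventional conversion of 50 ng/µL of double-stranded DNA per A260 unit at a 1 cm path length, and the absorbance readings are hypothetical values chosen for illustration.

```python
def dsdna_concentration_ng_per_ul(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Estimate dsDNA concentration from absorbance at 260 nm.

    Assumes the conventional factor of 50 ng/uL per A260 unit at 1 cm.
    """
    return a260 / path_length_cm * 50.0 * dilution_factor


def purity_ratios(a230, a260, a280):
    """Return the (260/280, 260/230) absorbance ratios."""
    return a260 / a280, a260 / a230


# Hypothetical readings for one leaf extract (not data from this study).
reading = {"a230": 0.20, "a260": 0.40, "a280": 0.21}

conc = dsdna_concentration_ng_per_ul(reading["a260"])
ratio_280, ratio_230 = purity_ratios(**reading)

print(f"concentration: {conc:.1f} ng/uL")
print(f"260/280: {ratio_280:.2f}, 260/230: {ratio_230:.2f}")
```

A 260/280 ratio near 1.8 and a 260/230 ratio of about 2.0 or higher are commonly taken as indicators of acceptably pure DNA; the thresholds applied in this study are not stated.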
SNP markers and genotyping
All putative SNP markers were downloaded from the Banana Genome Hub (http://banana-genome-hub.southgreen.fr/). A total of 384 putative SNPs were selected based on genome distribution, with 33 to 36 SNPs on each of the 11 chromosomes. The 384 SNP sequences were submitted to the Assay Design Group at Fluidigm Corp. (South San Francisco, CA, USA) for final design and synthesis for the development of the SNP type genotyping panel.
The validation assays were based on competitive allele-specific PCR, which enables bi-allelic scoring of SNPs at specific loci (KBioscience Ltd, Hoddesdon, UK). Specific Target Amplification (STA) [28] was performed to enrich SNP sequences in the sample DNAs. Amplified samples were then genotyped using the nanofluidic 96.96 Dynamic Array™ IFC (Integrated Fluidic Circuit; Fluidigm Corp., South San Francisco, CA). The architecture, mechanics and analysis of the system using Fluidigm IFCs for SNP genotyping were described by Wang et al. [28]. Endpoint fluorescent images of the 96.96 array were acquired on a Fluidigm EP1™ imager, and the data were recorded and analysed with Fluidigm Genotyping Analysis Software [29]. The data were then exported in Excel format.
Data analysis
Raw data were first analyzed for call rate. Markers with a call rate < 90% were removed. Duplicate accessions were identified using pairwise multilocus matching among all individual samples. The program GenAlEx 6.5 [30,31] was used for computation, and samples that fully matched at the tested SNP loci were designated as identical cultivars or clones. After duplicate identification, the redundant samples were removed and descriptive statistics measuring the informativeness of the SNP markers were calculated based on the remaining distinct cultivars. Using the same program, key descriptive statistics were computed, namely minor allele frequency, observed heterozygosity, expected heterozygosity and Shannon’s information index.
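The filtering and matching steps above can be sketched as follows. This is a minimal illustration and not the GenAlEx implementation: the sample names, marker names and genotypes are hypothetical, failed calls are represented by `None`, genotypes are assumed to be written with alleles in a fixed order (so that "AG" and "GA" do not both occur), and two samples are treated as matching when they agree at every locus scored in both, which is an assumed handling of missing data.

```python
from itertools import combinations


def marker_call_rates(profiles, markers):
    """Fraction of samples with a successful call at each marker."""
    n = len(profiles)
    return {
        m: sum(p[i] is not None for p in profiles.values()) / n
        for i, m in enumerate(markers)
    }


def drop_low_call_markers(profiles, markers, min_rate=0.90):
    """Keep only markers whose call rate is at least min_rate."""
    rates = marker_call_rates(profiles, markers)
    keep = [i for i, m in enumerate(markers) if rates[m] >= min_rate]
    kept_markers = [markers[i] for i in keep]
    kept_profiles = {s: [p[i] for i in keep] for s, p in profiles.items()}
    return kept_markers, kept_profiles


def find_matches(profiles):
    """Pairs of samples identical at every locus scored in both samples."""
    pairs = []
    for a, b in combinations(sorted(profiles), 2):
        shared = [
            (x, y)
            for x, y in zip(profiles[a], profiles[b])
            if x is not None and y is not None
        ]
        if shared and all(x == y for x, y in shared):
            pairs.append((a, b))
    return pairs


# Hypothetical toy data: four samples scored at three markers.
markers = ["snp1", "snp2", "snp3"]
profiles = {
    "acc_A": ["AA", "AG", "CC"],
    "acc_B": ["AA", "AG", None],
    "acc_C": ["AG", "GG", "CT"],
    "acc_D": ["AA", "AG", "CC"],
}

kept_markers, kept_profiles = drop_low_call_markers(profiles, markers)
matches = find_matches(kept_profiles)
print(kept_markers)  # snp3 is dropped: 3 of 4 calls is below 90%
print(matches)
```

In the toy data, three of the four samples share one multilocus profile and would be flagged as putative duplicates; in the study itself no such matches were found among the 62 accessions.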
Distance-based multivariate analysis was used to assess the relationships among the individual abaca samples, as well as their relationships with reference samples from the USDA Musa germplasm collection. Pairwise genetic distances were computed using the Distance option, and Principal Coordinates Analysis (PCoA) based on the pairwise distance matrix was performed using the GenAlEx 6.5 program [30,31]. Both distance and covariance were standardized. In addition, a cluster analysis using the UPGMA (unweighted pair group method with arithmetic mean) method was performed to further examine the genetic relationships among the 62 abaca accessions. First, the distance between individuals was calculated with 100 bootstrap replicates using the shared proportion of alleles distance measurement in the program Microsatellite Analyser [32]. The resulting distance matrix was used to generate a consensus dendrogram using the program PHYLIP [33]. Thereafter, the dendrogram was visualized using the FigTree program version 1.3.1 [34].
The population structure of the abaca samples was analysed using the model-based Bayesian cluster analysis software STRUCTURE v2.3.4 [35]. The admixture model was applied and the number of clusters (K value), indicating the number of subpopulations, was set from 1 to 10. The analyses were carried out without assuming any prior information about the genetic group or geographic origin of the samples. Ten independent runs were performed for each fixed number of clusters (K value), each consisting of 100,000 iterations after a burn-in of 200,000 iterations. The Delta K value [36] was used to detect the most probable number of clusters using the online program STRUCTURE HARVESTER [37]. Permutation was performed using the computer program Clumpp v1.1.1 [38] and the resultant outputs were then visualized using the computer program Distruct v1.1 [39].
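To make the clustering step concrete, the sketch below computes a proportion-of-shared-alleles distance and builds a UPGMA tree in plain Python. It is an illustration only and does not reproduce the Microsatellite Analyser, PHYLIP or FigTree workflow: bootstrapping and consensus-tree construction are omitted, the distance is taken as one minus the fraction of alleles shared across loci scored in both samples (an assumed formulation), and the sample names and genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def shared_allele_distance(g1, g2):
    """1 - proportion of shared alleles over loci scored in both samples."""
    shared = total = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        # Multiset intersection counts alleles in common (0, 1 or 2).
        shared += sum((Counter(a) & Counter(b)).values())
        total += 2
    return 1.0 - shared / total


def upgma(names, dist):
    """Build a UPGMA tree as nested tuples.

    dist maps frozenset({a, b}) to the distance between items a and b.
    """
    clusters = {n: 1 for n in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        for c in clusters:
            # Size-weighted average distance to the new cluster.
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))


# Hypothetical toy genotypes at four SNP loci.
profiles = {
    "tex1": ["AA", "GG", "CC", "TT"],
    "tex2": ["AA", "GG", "CC", "CT"],
    "hyb1": ["AG", "AG", "CT", "CC"],
    "hyb2": ["GG", "AG", "CT", "CC"],
}

dist = {
    frozenset((a, b)): shared_allele_distance(profiles[a], profiles[b])
    for a, b in combinations(profiles, 2)
}
tree = upgma(list(profiles), dist)
print(tree)
```

In this toy example the two "tex" samples and the two "hyb" samples each join first and the two pairs join last, which is the kind of two-cluster topology the dendrogram is used to reveal.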
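The Delta K criterion can be illustrated with a short sketch. It assumes the common formulation in which Delta K is the absolute second-order difference of the mean log-likelihood across replicate runs, divided by the standard deviation of the log-likelihood at that K. The log-likelihood values below are hypothetical, and only three runs for K = 1 to 4 are shown for brevity, whereas the study used ten runs for K = 1 to 10.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of log-likelihoods, one per replicate run.
    Raises ZeroDivisionError if all runs at some K are identical.
    """
    out = {}
    for k in sorted(lnp_by_k):
        if k - 1 in lnp_by_k and k + 1 in lnp_by_k:
            second_diff = abs(
                mean(lnp_by_k[k + 1])
                - 2 * mean(lnp_by_k[k])
                + mean(lnp_by_k[k - 1])
            )
            out[k] = second_diff / stdev(lnp_by_k[k])
    return out


# Hypothetical log-likelihoods (not values from this study).
lnp = {
    1: [-3000.0, -3001.0, -2999.0],
    2: [-2500.0, -2502.0, -2498.0],
    3: [-2450.0, -2460.0, -2440.0],
    4: [-2430.0, -2450.0, -2410.0],
}

dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Delta K is undefined for the smallest and largest K tested, so the criterion cannot by itself support K = 1.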
RESULTS AND DISCUSSION
SNP validation, cultivar identification and descriptive statistics
Out of the 384 SNP markers selected from the Banana Genome Hub, 258 markers were monomorphic across the 62 abaca accessions and 32 markers yielded no amplified product. These markers, together with those that had a low call rate (<90%), were excluded from data analysis. The final 60 polymorphic SNPs were reliably scored across the validation panel and were used in data analysis. The 60 SNPs and their flanking sequences are listed in Table 2.
An example of SNP profiles for abaca cultivars is presented in Table 3. No duplicates were identified among the 62 abaca accessions by multilocus matching, and all cultivars could be differentiated by the 60 SNP markers. The two homonymous pairs (‘Igit’ and ‘Puti’; Figure 1) had distinct SNP profiles (Table 3).
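The marker counts above can be checked with a few lines of arithmetic. The number of markers excluded for low call rate is not reported; the value computed below is inferred under the assumption that the three exclusion categories do not overlap.

```python
total_markers = 384
monomorphic = 258
no_product = 32
retained = 60

# Inferred, assuming the exclusion categories are mutually exclusive.
low_call_rate = total_markers - monomorphic - no_product - retained

success_rate = retained / total_markers

print(f"inferred low-call-rate markers: {low_call_rate}")
print(f"cross-species success rate: {100 * success_rate:.1f}%")
```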
Descriptive statistics were then computed for the 60 polymorphic SNPs across the 62 abaca accessions and the results are presented in Table 4. The mean information index was 0.442, ranging from 0.143 to 0.693. The observed heterozygosity ranged from 0.016 to 1.00 with an average of 0.265, whereas the mean expected heterozygosity was 0.281, ranging from 0.062 to 0.500. The minor allele frequencies of these 60 SNPs ranged from 0.032 to 0.500 with an average of 0.196 (Table 4).
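For a bi-allelic SNP, these statistics follow directly from the allele frequencies. The sketch below uses the standard definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and Shannon's index as the negative sum of p ln p); it is not the GenAlEx code, and the genotypes are hypothetical. Under these definitions the upper bounds reported above, 0.500 for expected heterozygosity and 0.693 (ln 2) for the information index, correspond to a SNP with two equally frequent alleles.

```python
from collections import Counter
from math import log


def snp_stats(genotypes):
    """Per-locus statistics for diploid genotypes such as 'AG' (None = no call)."""
    calls = [g for g in genotypes if g is not None]
    alleles = Counter("".join(calls))
    n_alleles = sum(alleles.values())
    freqs = [count / n_alleles for count in alleles.values()]
    return {
        "maf": min(freqs) if len(freqs) > 1 else 0.0,
        "Ho": sum(g[0] != g[1] for g in calls) / len(calls),
        "He": 1.0 - sum(p * p for p in freqs),
        "I": -sum(p * log(p) for p in freqs),
    }


# Hypothetical loci: one with equal allele frequencies, one skewed.
balanced = snp_stats(["AG", "AG", "AG", "AG"])
skewed = snp_stats(["AA", "AA", "AA", "AG"])

print(balanced)
print(skewed)
```

The balanced locus also shows how an observed heterozygosity of 1.00 can arise: every sample is heterozygous, which in a clonally propagated crop may reflect fixed heterozygosity rather than random mating.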
The validation result demonstrated that the set of 60 SNP markers was effective for assessing the genetic identity of abaca germplasm. All 62 abaca accessions could be clearly differentiated based on the 60 SNPs (Table 3). No duplicates were found in the present study. However, homonymous mislabeling was identified in two pairs of abaca cultivars. These accessions shared the same name but were collected from different regions in the Philippines. Morphologically, it is difficult to differentiate them (Figure 1). In these cases, SNP genotyping provides clear evidence on which renaming of these accessions can be based. The present result also showed that three interspecific hybrids and one M. balbisiana sample were possibly mislabeled in terms of their pedigree and taxonomic status.
The approach of cross-species adaptation enabled us to generate high-quality SNP profiles for abaca cultivar identification. However, the overall success rate of using banana SNPs was relatively low. Out of the 384 SNPs evaluated, only 60 (15.6%) met the requirements for a genotyping panel. This was likely due to the large genetic difference between abaca and banana. The lack of a draft genome for abaca is another hurdle to the effective development of SNP markers for abaca at the present time. For marker-assisted breeding, many more SNP markers will be needed. Sequencing of a draft genome of abaca is currently underway (Galvez, unpublished data), which will enable large-scale development of SNP markers through NGS technology. Nonetheless, this validated set of SNPs is highly useful for abaca genotype identification, germplasm management and certification of planting materials, all of which will contribute to more efficient crop improvement and crop production.
The result of the cluster analysis is fully consistent with that of PCoA. Two deeply separated clusters were revealed in the UPGMA dendrogram (Figure 3). Six accessions with known hybrid origin, including ‘Canton’, ‘Daratex’, ‘Musa Tex 82’, Hybrid 1, Hybrid 2 and ‘Mamakaw’, were grouped in a small cluster, demonstrating that they had a different genetic background from M. textilis. Again, the UPGMA did not separate the other three inter-specific hybrids (‘Agpas’, ‘Mi-NC’ and ‘MTP’) and one M. balbisiana sample (‘Lacatan’) from the rest of the M. textilis samples, which further supports the possible mislabeling in this collection.
Population stratification of the 62 abaca accessions, based on the ΔK value computed by STRUCTURE HARVESTER [37], revealed two clusters as the most probable number of K (Figure 3), and the partition was largely consistent with the principal coordinates analysis (Figure 2). Out of the 62 abaca accessions, five accessions, including ‘Canton’, ‘Daratex’, ‘Hybrid 1’, ‘Hybrid 2’ and ‘Mamakaw’, were differentiated from the rest of the accessions, demonstrating their exotic genetic background due to interspecific hybridization. However, as in the PCoA, the three additional recorded inter-specific hybrids (‘Agpas’, ‘Mi-NC’ and ‘MTP’), as well as one accession of M. balbisiana (‘Lacatan’), were not differentiated from the M. textilis cultivars, indicating that they are mislabeled. In addition, the cultivars ‘Binagakay’, ‘Canarahon’ and ‘Linono’ were found to have a partial pedigree contribution from hybrid parents, indicating that they were backcrossed progeny of the inter-specific (M. textilis x M. balbisiana) hybrids.
Although the result is based on only 62 abaca accessions, the distance-based and model-based analytical methods both clearly showed that the abaca genepool is heterogeneous. As shown in the PCoA (Figure 3), UPGMA (Figure 4) and Bayesian stratification (Figure 5), there were several ‘outsiders’ that did not belong to the core group of M. textilis. This heterogeneous structure appears compatible with the breeding history of abaca in the Philippines. Musa textilis is indigenous to the Philippines and wild populations still exist in the highlands [1,2,8]. M. balbisiana is also widely distributed in the Philippines, although the Philippines may not be the center of origin of this species [1,40]. Nonetheless, ‘Pacol’, a M. balbisiana type, has long been cultivated by subsistence farmers in the Philippines as a source of food and fiber [2,8]. It is well documented that natural hybridization occurred between ‘Pacol’ and M. textilis [1,2]. Despite their pedigree from M. balbisiana, these inter-specific hybrids were often considered ‘abaca’, of which ‘Canton’ is a well-known example [2]. Since 1920, numerous hybrids have been developed by various breeding programs through intra- and inter-specific hybridization, with the main objectives of increasing productivity and resistance to bunchy top and mosaic viruses [8,13,17]. Therefore, the intensive genetic introgression from M. balbisiana (and possibly from other Musa species as well) may explain the broad genetic diversity in the current genepool of abaca.
However, to accurately dissect the ancestries of current abaca germplasm, more SNP markers that can generate polymorphic profiles across M. textilis, M. balbisiana and other related species need to be developed. This would require either screening more SNPs using the present strategy of cross-species adaptation or using next generation sequencing methods (e.g. genotyping-by-sequencing). Such general Musa SNP markers could enable effective genotyping of all possible ancestral species of abaca. In addition, SNP profiles of M. balbisiana and other related species need to be established at the population level, to better quantify the ancestral contribution of M. balbisiana (and possibly other Musa species) to the hybrid abaca germplasm.
Moreover, the national abaca germplasm collection in the Philippines maintains more than 200 abaca accessions and there are many uncollected landraces in farmers’ fields. The present study analysed only a small fraction of the accessions available in the genebank. The full spectrum of germplasm accessions needs to be included to understand the population structure in the primary gene pool of abaca. Additional analytical approaches, such as discriminant analysis of principal components (DAPC), could be applied to provide insight that is complementary to the present study.
The genetic relationships among the analysed abaca samples are presented in the principal coordinates analysis (PCoA) plots (Figure 2A-2B). The three main PCoA axes accounted for 26.3% of the total variation. Although the pattern of grouping was not clear-cut, it appeared that all 62 tested accessions could be grouped into two types. The first type comprised most cultivars of M. textilis origin, both farmer cultivars and breeding lines. The second type was much smaller and included hybrids between M. textilis and M. balbisiana such as ‘Canton’, ‘Daratex’, ‘Musa Tex 82’, Hybrid 1, Hybrid 2 and ‘Mamakaw’. However, three inter-specific hybrids (‘Agpas’, ‘Mi-NC’ and ‘MTP’) and one M. balbisiana sample (‘Lacatan’) grouped with the M. textilis accessions, suggesting possible mislabeling of these accessions.
CONCLUSION
Despite the economic importance of abaca in the Philippines, research tools for germplasm management and genetic improvement of abaca are still in their infancy. Lack of accurate information on genetic integrity is a primary concern for abaca researchers and growers. It has been difficult to determine a true-to-type cultivar based solely on phenotype, which causes confusion and uncertainty in the use of breeding materials. We conducted a pilot study to evaluate a set of banana SNP markers for abaca genotyping using a nanofluidic array. The cross-species transfer of nuclear SNP markers led to the selection of 60 polymorphic SNPs suitable for abaca DNA fingerprinting. The generated SNP profiles enabled accurate identification of all tested abaca cultivars and detection of homonymous naming mistakes. Cultivars with an inter-specific hybrid background were differentiated using multivariate analysis and Bayesian stratification. The result demonstrated that this approach could serve as a shortcut for SNP development in abaca. These selected SNPs are highly useful for downstream applications in the abaca industry, including cultivar identification, nursery accreditation and protection of breeders’ rights.
ACKNOWLEDGEMENTS
We would like to give special thanks to Stephen Pinney for assisting with SNP genotyping using nanofluidic array, Sue Mischke for editing this manuscript and Brian Irish for valuable discussion. This work was partially supported by the Fulbright Scholar Program. References to a company and/or product by the USDA are only for the purposes of information and do not imply approval or recommendation of the product to the exclusion of others that may also be suitable.
REFERENCES
1. Halos S. The Abaca. Department of Agriculture. Quezon City. 2008; 188.
3. Copeland EB. Nomenclature of the abaca plant. Philipp J Sci. 1927; 33:141-153.
13. Labrador AF. The abaca project of La Carlota Experiment Station. Philippine Agr Rev. 1928; 1: 3-19.
16. Jeffries MJ. Biodiversity and Conservation. Routledge, London and New York. 1997; 208.
32. Felsenstein J. PHYLIP: phylogeny inference package (version 3.2). Cladistics. 1989; 5: 164-166.
33. Rambaut A. Molecular evolution, phylogenetics and epidemiology: FigTree. 2006-2009.