Current Status of Genomics Research on Mycotoxigenic Fungi
- 1. USDA-ARS, Beltsville Agricultural Research Center, Food Quality Laboratory, USA
- 2. Department of Plant Biology and Pathology, Rutgers University, USA
Abstract
Mold-produced secondary metabolites that are toxic and carcinogenic are termed mycotoxins. They are biosynthesized by a number of fungi, mainly species in the genera Aspergillus, Fusarium, and Penicillium. Mycotoxins contaminate agricultural commodities such as grains, fruits, and nuts. Due to their toxic and carcinogenic properties, they pose a serious health hazard to animals and humans and cause staggering economic losses to growers, packers, processors, and consumers annually. Research on major mycotoxins, such as aflatoxins, using molecular biological and genetic tools has uncovered the genes, gene clusters, biosynthetic pathways, and genetic regulatory mechanisms involved in their formation. The field of genomics has provided scientists with high-throughput tools to study mycotoxin biosynthesis and regulatory networks with a new level of scientific rigor. In this paper, the current status of genomic investigations on mycotoxigenic fungi is summarized in order to better understand mycotoxin biosynthesis, genetic regulation, genome structure, and evolutionary aspects. In addition, the advantages, challenges, and future perspectives in studying mycotoxins are discussed. The information and knowledge contained in this paper may guide possible solutions to abate mycotoxin contamination of agricultural commodities for human consumption and animal feed.
Citation
Yu J, Jurick WM, Bennett JW (2015) Current Status of Genomics Research on Mycotoxigenic Fungi. Int J Plant Biol Res 3(2): 1035.
Keywords
• Mycotoxins
• Aflatoxins
• Patulin
• Aspergilli
• Penicillium
• Fusarium
ABBREVIATIONS
NGS: Next Generation Sequencing; CPA: Cyclopiazonic Acid; DON: Deoxynivalenol; ST: Sterigmatocystin; DHST: Dihydrosterigmatocystin; ORF: Open Reading Frame; FFSC: Fusarium (Gibberella) fujikuroi Species Complex; OTA: Ochratoxin A; SMURF: Secondary Metabolite Unique Region Finder; AntiSMASH: Antibiotics and Secondary Metabolite Analysis Shell; AspGD: Aspergillus Genome Database.
INTRODUCTION
Mycotoxins and mycotoxigenic fungi
Fungi are diverse and complex life forms. They play an essential role in carbon and nitrogen recycling by breaking down organic matter, especially plant biomass (e.g. leaf litter). The fungal kingdom also includes harmful pathogens that cause diseases of plants, animals, and humans. Some fungi are valued as gourmet foodstuffs that regularly appear on our dining tables (for example, edible basidiomycetes, truffles, and certain mold-fermented foods such as Camembert cheese). Finally, numerous fungal species are capable of producing a broad array of chemically diverse secondary metabolites (SM) [1]. Some of the SM produced by fungi are beneficial and have useful pharmaceutical properties, such as antibiotics and other compounds used as drugs [2]. For example, Penicillium chrysogenum produces penicillin, a well-known antibiotic drug that has saved thousands of lives since World War II. Aspergillus terreus produces lovastatin, a potent cholesterol-lowering drug. Other Aspergillus species secrete antibiotics (cephalosporin), antifungals (griseofulvin), and anti-tumor drugs (terrequinone A) [1,3]. In contrast, a number of SM are toxic and carcinogenic to animals and humans [1,2,4]. Well-studied mycotoxins include aflatoxins, ochratoxins, sterigmatocystins, cyclopiazonic acid (CPA), kojic acid, patulin, citrinin, fumonisins, trichothecenes such as deoxynivalenol (DON) and T-2 toxin, and zearalenone [5,6]. These toxins are mostly produced by Aspergillus, Fusarium, and Penicillium spp., although other fungal genera are also implicated.
Food and feed contamination by mycotoxins, especially by aflatoxins, fumonisins, trichothecenes, ochratoxins, and patulin, is a significant food safety issue in developing countries because of the lack of detection, monitoring, and regulations to safeguard the food supply. It is estimated that approximately 4.5 billion people living in developing countries are chronically exposed to uncontrolled amounts of aflatoxin, which results in negative changes in immune and nutritional status [7]. Major outbreaks of acute aflatoxicosis in humans from contaminated food have been documented [8]. For example, in western India in 1974, 108 of the 397 people affected died from aflatoxin poisoning [9]. A more recent incident of aflatoxin poisoning occurred in Kenya in July 2004, when consumption of aflatoxin-contaminated corn led to the deaths of 125 of the 317 people reported ill [8,9]. Due to their toxic and carcinogenic effects, aflatoxins have received a great deal of attention from the research community, and the aflatoxin biosynthetic pathway is one of the best-studied fungal secondary metabolic pathways [10-12].
The number of uncharacterized secondary compounds produced by fungi via various metabolic pathways is unknown. These include pathway end products and intermediates or shunt metabolites formed along these pathways. These uncharacterized compounds are likely to include many new SM that will have beneficial pharmaceutical properties that can be explored for potential drug discovery, while others may possess toxic or carcinogenic properties. To maximize the likelihood of discovering new drugs and to minimize the harmful effects of mycotoxins for food safety, uncharacterized SM clusters are the subject of investigations by scientists worldwide.
Genomics approaches to study mycotoxins and mycotoxigenic fungi
Technical breakthroughs in DNA sequencing have provided a high throughput tool to study genes and genetics at the genome scale. The availability of Next Generation Sequencing (NGS) technologies allows scientists to sequence a given fungal genome and discover all of the putative functional genes of a genome in a very short time [13-15]. NGS technologies are broadly applied in functional genomics for transcriptome studies. In the post-genomic era, the genomes of numerous biologically and economically important fungi have been sequenced. Comparative genomics analysis of related toxigenic fungal species has revealed a vast array of information concerning mycotoxin biosynthetic pathways, pathway genes, gene clusters, genomic organization, and their evolution. Genome sequencing data enrich our knowledge of the evolutionary status and phylogenetic relationships of related fungal species. It is expected that the genomic data accumulated over the years, and the accompanying knowledge gained through functional genomic studies, can be translated into biotechnological strategies for preventing mycotoxin contamination in food and feed.
DISCUSSION AND CONCLUSION
Aspergillus toxins and genomics
Mycotoxins have very diverse chemical structures, toxic effects, and biological activities [16,49]. At sufficient concentrations, some mycotoxins have acute toxic effects leading to death, while long-term exposure to lower concentrations results in chronic effects, such as suppressed immune response, malnutrition, or cancer [17,18]. Among the identified mycotoxins, aflatoxins are the most toxic and the most potent natural carcinogens. These A. flavus toxins were first identified as the cause of a severe animal poisoning incident in England in 1960 called Turkey X disease [19,20]. Most strains of A. flavus produce aflatoxins B1 and B2, whereas the closely related species A. parasiticus produces aflatoxins B1, B2, G1, and G2. Further, aflatoxin M1 is a hydroxylated derivative metabolized from aflatoxin B1 by cows and secreted in milk [18]. In addition to aflatoxins, A. flavus also produces many other mycotoxins, such as cyclopiazonic acid (CPA), kojic acid, beta-nitropropionic acid, aspertoxin, aflatrem, and aspergillic acid [21]. Sterigmatocystin (ST) and dihydrosterigmatocystin (DHST), the penultimate precursors of aflatoxins, are produced by several species, including Aspergillus versicolor and Aspergillus nidulans. Although somewhat less toxic and carcinogenic than aflatoxins, the sterigmatocystins produced by A. nidulans and A. versicolor share common biochemical pathways, homologous genes, and regulatory mechanisms with aflatoxins [12,22]. In A. flavus and A. parasiticus, a complete aflatoxin pathway gene cluster consisting of 30 genes or open reading frames (ORFs) has been confirmed within an 80 kb DNA sequence [12].
Aspergillus genomics was initiated during the late 20th century by coordinated international efforts to sequence three genomes: the medically important Aspergillus fumigatus [23], the biological model A. nidulans [24], and the industrially important fungus A. oryzae [32]. The papers describing these three genomes were published concurrently in Nature in 2005. In the early 21st century, whole genome sequencing of A. flavus (strain NRRL 3357) was initiated in order to study aflatoxin biosynthesis and genetic regulation for food safety, and the sequencing was completed in 2005 [26]. Primary assembly indicated that the A. flavus genome consists of 8 chromosomes and has a size of about 36.8 Mb. Preliminary annotation indicated that there are 13,485 functional genes in the A. flavus genome, a number similar to those of other Aspergillus species [23,24,27,28]. Using the Secondary Metabolite Unique Region Finder (SMURF) program [29], fifty-six (56) SM gene clusters were identified in A. flavus and their relative physical locations in the genome were determined. The aflatoxin gene cluster is located on chromosome III near a sub-telomeric region [30,31]. The sequence data have been deposited in the NCBI GenBank database (http://www.ncbi.nlm.nih.gov) under WGS accession AAIH02000000, and the Genome Announcement was submitted to ASM [32]. The data are also available through the A. flavus website (http://www.aspergillusflavus.org), the Aspergillus Comparative Database of The Broad Institute at MIT (http://www.broadinstitute.org/annotation/genome/aspergillus_group/MultiHome.html), and the Central Aspergillus Data Repository in the United Kingdom (http://www.cadre-genomes.org.uk/aspergillus_links.html).
Comparative genomics studies of the aflatoxin-producing A. flavus strain NRRL 3357 indicated that it is very similar to A. oryzae in genome size (36.7 Mb) and the number of predicted genes (12,079). Both genomes are enriched in genes for SM. The A. flavus and A. oryzae genomes are predicted to have 35 vs. 30 polyketide synthases, 24 vs. 24 non-ribosomal peptide synthases, and 122 vs. 151 P450 enzymes, respectively. There are 255 genes unique to A. flavus and 299 genes unique to A. oryzae.
Genomes of several additional aflatoxin-producing Aspergillus strains also have been sequenced. A. parasiticus is a soil-borne pathogen that infects peanut and produces large amounts of aflatoxins (B1, B2, G1, and G2). The A. parasiticus strain SU-1 genome has recently been sequenced [33]. The A. parasiticus SU-1 genome is estimated to be about 39 Mb in size and is predicted to consist of 8 chromosomes similar in size to those of A. flavus NRRL 3357. Although the SU-1 genome (39 Mb) is about 2 Mb larger than that of A. flavus NRRL 3357 (36.8 Mb), the number of functional genes in the two genomes is similar (13,290 and 13,485 genes, respectively). About 4% of the A. flavus NRRL 3357 ORFs have no detectable homolog in the A. parasiticus SU-1 genome [33]. The two genomes share greater than 90% sequence identity over more than 90% of their length, and both contain about fifty-six SM gene clusters as detected by SMURF [29]. In an independent effort, A. parasiticus SU-1 and another aflatoxin-producing strain, A. flavus #70, also have been sequenced. The sequence data are under annotation and analysis (Yu et al., unpublished). A. flavus #70 produces small sclerotia and large amounts of aflatoxins, in contrast to the previously sequenced A. flavus NRRL 3357, which produces large sclerotia. A. flavus strain #70 is highly virulent and causes infection in cotton bolls. Future studies will attempt to identify the factors that contribute to its virulence and sclerotial morphology.
The genome sequence of Aspergillus niger, a member of the black aspergilli that is widely used in biotechnology for the production of food ingredients, pharmaceuticals, and industrial enzymes, was published in 2007 [27]. The genome of A. niger CBS 513.88 is about 33.9 Mb in size. A total of 14,165 open reading frames were identified, and functional predictions were made for 6,506 of these genes. The sequence and annotation data of the genomes described above, and of other related Aspergillus genomes, have been curated by the Aspergillus Genome Database (AspGD, http://www.aspgd.org/) [34].
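The genome sizes and gene counts quoted in this section are easier to compare on a per-megabase basis. The short Python sketch below does that arithmetic for three of the genomes; the figures are taken from the text, while the dictionary layout and the function name are illustrative assumptions rather than part of any published analysis.

```python
# Genome size (Mb) and predicted gene/ORF counts as quoted in the text.
GENOMES = {
    "A. flavus NRRL 3357": (36.8, 13485),
    "A. parasiticus SU-1": (39.0, 13290),
    "A. niger CBS 513.88": (33.9, 14165),
}


def gene_density(size_mb, n_genes):
    """Return the number of predicted genes per megabase."""
    return n_genes / size_mb


densities = {name: gene_density(*stats) for name, stats in GENOMES.items()}

# Print the genomes from most to least gene-dense.
for name, density in sorted(densities.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:22s} {density:6.1f} genes/Mb")
```

Gene counts depend heavily on the annotation pipeline used for each genome, so differences of this kind are better read as a rough orientation than as a biological result.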
Fusarium toxins and genomics
The genus Fusarium is another widespread group of filamentous fungal species. Some species of Fusarium produce toxigenic SM, including fumonisins and trichothecenes. These Fusarium mycotoxins have the potential to contaminate grains and animal feeds worldwide, requiring the surveillance of international agencies [35]. The economic impact of these mycotoxins on health costs and their effect on international trade is estimated to be in the hundreds of millions of dollars annually [36].
Fumonisins are a family of mycotoxins that includes fumonisins B1, B2, B3, B4, A1, A2, C1, C3, and others [16]. Fumonisin-producing species are members of the Fusarium (Gibberella) fujikuroi species complex (FFSC) [37]. These include F. moniliforme, F. proliferatum, F. verticillioides, and F. oxysporum [37]. These fungi are found in soils across the world and have the potential to infect crops, particularly maize (corn), posing an enormous threat to the health of humans and domesticated animals [38]. Fumonisin B1 has a chemical structure similar to that of sphinganine and sphingosine, both of which are important substrates in sphingolipid metabolism. Fumonisin B1 disrupts sphingolipid metabolism through competitive inhibition of ceramide synthase, which blocks the conversion of sphinganine and sphingosine to ceramides. Human exposure occurs most commonly in populations where maize is the dietary staple. In addition to a suspected association with neural tube defects, fumonisin exposure has been correlated with higher levels of cancer, especially esophageal and liver cancer [39,40]. The fumonisin biosynthetic pathway and cognate loci consist of at least 16 genes [37].
Another major group of mycotoxins produced by Fusarium species is the trichothecenes. Deoxynivalenol (DON) and T-2 toxin are the branched products of the trichothecene pathway. They are commonly produced by at least 24 Fusarium species, including F. equiseti, F. graminearum, F. moniliforme, and F. sporotrichioides [41]. Seven additional fungal genera are also reported to produce trichothecenes [16]. Trichothecene biosynthesis is reported to involve at least 12 genes [42].
Genome sequencing of mycotoxin-producing Fusarium species started with Fusarium verticillioides Expressed Sequence Tags (EST) reported at a fungal genomics workshop in 2002. The genomes of F. verticillioides [43], F. graminearum [44], F. fujikuroi IMI [45], and F. oxysporum [43] have been sequenced and reported. Comparative genomics studies revealed the genes, gene clusters, and cluster evolution responsible for fumonisin and trichothecene biosynthesis [46]. The whole genome sequence data have facilitated the identification of complete gene clusters involved in the biosynthesis of SM, pigments, and mycotoxins [37,47].
Penicillium toxins and genomics
Approximately one hundred Penicillium species are capable of producing mycotoxins; however, the majority of these are not commonly found in food commodities. Nevertheless, three major mycotoxins produced by Penicillium spp. are a food safety concern for humans and animals: ochratoxin A (OTA), patulin, and citrinin [48]. P. verrucosum and P. nordicum, as well as A. ochraceus, produce OTA. Patulin is produced by a number of species belonging to both Aspergillus and Penicillium [49]; however, the main producers of patulin are strains of P. expansum, which cause postharvest decay of apples and pears [48,50]. In addition, P. expansum produces citrinin, penicillic acid, penitrem A, and rubratoxin B. P. citrinum is the main producer of citrinin. Genetic and genomic studies on the biosynthesis of these toxins have lagged significantly behind those on aflatoxins and trichothecenes. The OTA biosynthetic pathway in P. nordicum was found to involve 3 genes within a 10 kb DNA region [51]. The patulin biosynthetic pathway is chemically well characterized. A putative patulin gene cluster was first reported by genome sequencing of non-producing strains of A. fumigatus and A. clavatus [52,53].
Genome sequencing in the genus Penicillium began with species of pharmaceutical and industrial value, including the penicillin-producing P. chrysogenum [54]; the main postharvest pathogen of citrus, P. digitatum [55]; the lignocellulose-degrading P. oxalicum (P. decumbens) [56]; two cheese-related Penicillium species, P. roqueforti and P. camemberti [57]; and the endophytic fungal species P. aurantiogriseum [58]. A strain of P. expansum was sequenced recently in order to learn more about the genes involved in patulin biosynthesis [59]. The patulin gene cluster was first identified in Aspergillus clavatus by whole genome sequencing [53] and predicted by SMURF [29] to consist of 15 genes in the following order: patH, patG, patF, patE, patD, patC, patB, patA, patM, patN, patO, patL, patI, patJ, and patK [53]. The functions of two of the patulin pathway genes, which encode cytochrome P450-type enzymes in A. clavatus, were characterized [52]. The patulin gene cluster in P. expansum also was reported [60]. These genes share 60-70% sequence identity with those in Aspergillus clavatus [52,53], although their gene order is different. To understand the pathogenicity and mycotoxin biosynthesis of the blue mold fungus that causes postharvest decay of pome fruit [61], the most virulent and economically significant strain in our collection, P. expansum R19, was sequenced and compared with the less virulent P. solitum strain RS1. The calculated genome size of P. expansum R19 was 31,415,732 bp [59]. This is consistent with the genome size reported previously for P. chrysogenum [54]. Preliminary annotation indicated that the P. expansum R19 genome harbors 10,554 predicted genes with an average gene length of 1,599 bp. There are 120 tRNA genes and 48 5S rRNA genes. It is estimated that there are 59 gene clusters putatively involved in the biosynthesis of SM as predicted by SMURF [29].
This is similar to the result predicted by the AntiSMASH program, the Antibiotics and Secondary Metabolite Analysis Shell [62], which predicted 57 clusters. Genes that are putatively involved in spore germination, mycelial growth, and mycotoxin biosynthesis, specifically of patulin and citrinin, are under investigation [59]. The genome sequence of P. expansum R19 has been deposited at DDBJ/EMBL/GenBank under the accession JHUC00000000. The version described in this paper is version JHUC01000000. Transcriptome studies on P. expansum and comparison with related non-patulin-producing strains also have been reported [63]. The recently published genomes of Penicillium species [64] indicated that P. expansum contains 55 SM gene clusters.
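As a back-of-the-envelope check on the P. expansum R19 annotation figures given above, the following Python sketch estimates what share of the genome is covered by predicted genes. It assumes, purely for illustration, that predicted genes do not overlap and that the reported average gene length applies to all 10,554 genes; the variable names are hypothetical.

```python
# Figures reported in the text for P. expansum R19.
GENOME_BP = 31_415_732   # calculated genome size in base pairs
N_GENES = 10_554         # predicted genes
MEAN_GENE_BP = 1_599     # average gene length in base pairs

# Total sequence covered by predicted genes (assumes no overlapping genes).
genic_bp = N_GENES * MEAN_GENE_BP
genic_fraction = genic_bp / GENOME_BP

# Gene density expressed per megabase of genome.
genes_per_mb = N_GENES / (GENOME_BP / 1e6)

print(f"Genic sequence: {genic_bp:,} bp ({genic_fraction:.1%} of the genome)")
print(f"Gene density:   {genes_per_mb:.0f} genes/Mb")
```

Under these assumptions, roughly half of the genome is covered by predicted genes, at a density of about 336 genes per megabase.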
Advantages and limitations of genomic technologies
The expanding list of sequenced genomes provides new insights into fungal biology, mycotoxin biosynthesis, genetic regulation, pathogenicity, phylogenetic relationships, and evolution. The availability of whole genome sequence data makes it possible to predict all of the genes in a genome, and in general the SM pathway genes tend to be grouped together as a cluster [65,66]. Based on the characteristic sequence signatures of SM backbone genes, several software tools were developed to rapidly predict SM gene clusters in a given genome. Secondary Metabolite Unique Region Finder (SMURF) was the first software developed [29], followed by the AntiSMASH program [62]. The two programs give different predictions. For example, in the genomes of A. flavus and A. parasiticus, 56 SM gene clusters were predicted using the SMURF program, whereas about 70 SM gene clusters were predicted using the AntiSMASH program, some of which are remnant and incomplete. For each fungal species, the number of SM gene clusters predicted from genome sequence data is almost 10 times greater than the number of SM compounds that have been chemically identified. Therefore, it can be deduced that most SM gene clusters are silent under normal laboratory conditions. It is hypothesized that these clustered SM genes are expressed only under very specific conditions (e.g. temperature, pH, nutrition, biological niche, or competition with other microorganisms) [2]. For that reason, a third SM prediction algorithm, MIDAS-M [66], was developed based on gene expression patterns detected by EST, microarray, or RNA-Seq data. The advantage of MIDAS-M is that it detects only those SM gene clusters that are expressed. Nevertheless, a major challenge remains in distinguishing functional SM gene clusters from pseudo, nonfunctional, incomplete, or remnant SM gene clusters.
This challenge remains a major bottleneck in the ‘post-genomic’ era, and major technological advances in functional genetic and/or mutational analysis will be required to conclusively demonstrate the function of predicted clusters.
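The general idea of anchoring a cluster prediction on a backbone gene can be illustrated with a deliberately simplified Python sketch. It is not the SMURF, AntiSMASH, or MIDAS-M algorithm: the annotation labels, the gap tolerance, the gene names, and the toy chromosome are all hypothetical, and real tools work from protein domain content and trained models rather than from text labels.

```python
from dataclasses import dataclass

# Hypothetical labels treated as SM backbone enzymes in this toy example.
BACKBONE = {"PKS", "NRPS", "DMATS"}
# Hypothetical labels treated as typical accessory (tailoring, transport,
# regulatory) genes found next to backbone genes.
ACCESSORY = {"P450", "methyltransferase", "transporter", "transcription_factor"}


@dataclass
class Gene:
    name: str
    annotation: str


def _extend(genes, start, step, max_gap):
    """Walk from `start` in direction `step` (+1 or -1) and return the index
    of the last SM-related gene reached, tolerating runs of up to `max_gap`
    unrelated genes."""
    last, gap, i = start, 0, start + step
    while 0 <= i < len(genes):
        if genes[i].annotation in BACKBONE | ACCESSORY:
            last, gap = i, 0
        else:
            gap += 1
            if gap > max_gap:
                break
        i += step
    return last


def predict_clusters(genes, max_gap=1):
    """Return candidate clusters as lists of gene names, for genes given in
    chromosomal order. Each cluster is grown outward from a backbone gene."""
    spans = []
    for idx, gene in enumerate(genes):
        if gene.annotation not in BACKBONE:
            continue
        left = _extend(genes, idx, -1, max_gap)
        right = _extend(genes, idx, +1, max_gap)
        if spans and left <= spans[-1][1]:
            # Overlaps the previous candidate: merge the two spans.
            spans[-1] = (min(spans[-1][0], left), max(spans[-1][1], right))
        else:
            spans.append((left, right))
    return [[g.name for g in genes[a:b + 1]] for a, b in spans]


# Toy chromosome; "other" marks genes unrelated to secondary metabolism.
toy_chromosome = [
    Gene("g1", "other"), Gene("g2", "transporter"), Gene("g3", "P450"),
    Gene("g4", "PKS"), Gene("g5", "other"), Gene("g6", "methyltransferase"),
    Gene("g7", "other"), Gene("g8", "other"), Gene("g9", "NRPS"),
    Gene("g10", "other"),
]

for cluster in predict_clusters(toy_chromosome):
    print(cluster)
```

In this toy input the second candidate consists of a lone backbone gene with no accessory genes, which is the kind of prediction that sequence alone cannot classify as a functional cluster or a remnant.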
FUTURE PERSPECTIVE
It has been almost two decades since Professor Joan W. Bennett suggested that fungal biologists should create a “wish list” for fungal genome sequences [4]. Since then, technological breakthroughs (e.g. Next Generation Sequencing) have greatly increased the speed and lowered the cost of sequencing a fungal genome. In fact, the decline in the cost per reaction of DNA sequencing has been compared to Moore’s Law [67]. In order to study fungal biology and evolution, to address important problems associated with energy and the environment, and to explore novel SM for pharmaceutical drug discovery, the Joint Genome Institute (JGI) has geared up to sequence 1000 fungal genomes. Currently, the genomes of at least 24 Aspergillus, four Fusarium, and thirteen Penicillium species are curated at JGI MycoCosm (http://genome.jgi-psf.org/programs/fungi/index.jsf). The availability of hundreds of fungal genomes in public databases reflects the significant progress of the field. Moreover, these growing fungal genomics resources will help us to decipher the genes and pathways regulating both mycotoxin production and virulence and to learn more about the genes that affect evolutionary adaptability. RNA-Seq technology has been employed to characterize fungal transcriptomes and to reveal quantitative differences in gene expression between the environmental conditions analyzed. It is anticipated that a high-resolution view of entire fungal transcriptomes will allow researchers to identify genes differentially expressed under conditions conducive and nonconducive to mycotoxin production. With the rapid progress in fungal genomics, a vast amount of new information on gene function, genetic regulation, and signal transduction will be amassed. It is now time to integrate the brute-force, gene-by-gene strategy that was so fruitful in the late 20th and early 21st centuries with the now-mature whole-genome approach.
It has become apparent that we must stop thinking about the parts and start thinking about the whole. New forms of systems analyses will allow us to understand the incredibly complex interactions between fungal SM and an ever-changing environment. These genetic and genomic resources will significantly enhance our understanding of the mechanisms of mycotoxin production, pathogenicity, and crop-fungal interactions. This information is vital in assisting scientists to discover new pharmaceutical drugs and to devise novel strategies to eliminate mycotoxin contamination, thereby resulting in a safer, more nutritious, and sustainable food and feed supply to nourish a growing planet.
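The comparison of gene expression between conducive and nonconducive conditions mentioned above is often summarized as a log2 fold change of normalized read counts. The Python sketch below shows that calculation in its simplest form; the gene names and read counts are invented for illustration, the counts-per-million normalization and pseudocount are assumed choices, and a real analysis would use replicates and a dedicated statistical package.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def log2_fold_change(conducive, nonconducive, pseudocount=1.0):
    """Return log2(conducive / nonconducive) per gene after CPM scaling.

    Both inputs must contain the same genes. The pseudocount avoids
    division by zero for genes with no reads in one condition.
    """
    a = counts_per_million(conducive)
    b = counts_per_million(nonconducive)
    return {
        gene: math.log2((a[gene] + pseudocount) / (b[gene] + pseudocount))
        for gene in a
    }


# Hypothetical raw read counts for three genes in two conditions.
conducive_counts = {"geneA": 800, "geneB": 100, "geneC": 100}
nonconducive_counts = {"geneA": 100, "geneB": 100, "geneC": 800}

lfc = log2_fold_change(conducive_counts, nonconducive_counts)
for gene, value in lfc.items():
    print(f"{gene}: log2 fold change = {value:+.2f}")
```

In this invented data set, geneA would be a candidate for a gene induced under toxin-conducive conditions and geneC for one that is repressed, while geneB is unchanged.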
ACKNOWLEDGEMENTS
Use of a company or product name by the U.S. Department of Agriculture does not imply approval or recommendation of the product to the exclusion of others that may also be suitable. This research was made possible by USDA-ARS project no. 1275-42430-014-00D via National Program 303 Plant Disease.