Natural products in drug discovery: advances and opportunities - Nature
Reviews

Natural products in drug discovery: advances and opportunities

Atanas G. Atanasov 1,2,3,4 ✉, Sergey B. Zotchev 2, Verena M. Dirsch 2, the International Natural Product Sciences Taskforce* and Claudiu T. Supuran 5 ✉

Abstract | Natural products and their structural analogues have historically made a major contribution to pharmacotherapy, especially for cancer and infectious diseases. Nevertheless, natural products also present challenges for drug discovery, such as technical barriers to screening, isolation, characterization and optimization, which contributed to a decline in their pursuit by the pharmaceutical industry from the 1990s onwards. In recent years, several technological and scientific developments (including improved analytical tools, genome mining and engineering strategies, and microbial culturing advances) are addressing such challenges and opening up new opportunities. Consequently, interest in natural products as drug leads is being revitalized, particularly for tackling antimicrobial resistance. Here, we summarize recent technological developments that are enabling natural product-based drug discovery, highlight selected applications and discuss key opportunities.

1 Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, Jastrzebiec, Poland.
2 Department of Pharmacognosy, University of Vienna, Vienna, Austria.
3 Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia, Bulgaria.
4 Ludwig Boltzmann Institute for Digital Health and Patient Safety, Medical University of Vienna, Vienna, Austria.
5 Università degli Studi di Firenze, NEUROFARBA Dept, Sezione di Scienze Farmaceutiche, Florence, Italy.
*A full list of members and affiliations is presented at the end of the paper.
✉e-mail: a.atanasov.mailbox@gmail.com; claudiu.supuran@unifi.it
https://doi.org/10.1038/s41573-020-00114-z
200 | March 2021 | volume 20 www.nature.com/nrd

sp3 carbon atoms
Tetravalent carbon atoms forming single covalent bonds with other atoms within the molecular structure. A higher fraction of sp3 carbons within molecules is a descriptor that indicates more complex 3D structures.

Historically, natural products (NPs) have played a key role in drug discovery, especially for cancer and infectious diseases1,2, but also in other therapeutic areas, including cardiovascular diseases (for example, statins) and multiple sclerosis (for example, fingolimod)3–5.

NPs offer special features in comparison with conventional synthetic molecules, which confer both advantages and challenges for the drug discovery process. NPs are characterized by enormous scaffold diversity and structural complexity. They typically have a higher molecular mass, a larger number of sp3 carbon atoms and oxygen atoms but fewer nitrogen and halogen atoms, higher numbers of H-bond acceptors and donors, lower calculated octanol–water partition coefficients (cLogP values, indicating higher hydrophilicity) and greater molecular rigidity compared with synthetic compound libraries1,6–9. These differences can be advantageous; for example, the higher rigidity of NPs can be valuable in drug discovery tackling protein–protein interactions10. Indeed, NPs are a major source of oral drugs ‘beyond Lipinski’s rule of five’11. The increasing significance of drugs not conforming to this rule is illustrated by the increase in molecular mass of approved oral drugs over the past 20 years12. NPs are structurally ‘optimized’ by evolution to serve particular biological functions1, including the regulation of endogenous defence mechanisms and the interaction (often competition) with other organisms, which explains their high relevance for infectious diseases and cancer. Furthermore, their use in traditional medicine may provide insights regarding efficacy and safety. Overall, the NP pool is enriched with ‘bioactive’ compounds covering a wider area of chemical space compared with typical synthetic small-molecule libraries13.

Despite these advantages and multiple successful drug discovery examples, several drawbacks of NPs have led pharmaceutical companies to reduce NP-based drug discovery programmes. NP screens typically involve a library of extracts from natural sources (Fig. 1), which may not be compatible with traditional target-based assays14. Identifying the bioactive compounds of interest can be challenging, and dereplication tools have to be applied to avoid rediscovery of known compounds. Accessing sufficient biological material to isolate and characterize a bioactive NP may also be challenging15. Furthermore, gaining intellectual property (IP) rights for (unmodified) NPs exhibiting relevant bioactivities can be a hurdle, since naturally occurring compounds in their original form may not always be patented (legal frameworks vary between countries and are evolving)16, although simple derivatives can be patent-protected (Box 1). An additional layer of complexity relates to the regulations defining the need for benefit sharing with countries of origin of the biological material, framed in the United Nations 1992 Convention on Biological Diversity and the Nagoya Protocol, which entered into force in 2014 (ref.17), as well as recent developments concerning benefit sharing linked to use of marine genetic resources18.

Although the complexity of NP structures can be advantageous, the generation of structural analogues to explore structure–activity relationships and to optimize
NP leads can be challenging, particularly if synthetic routes are difficult. Also, NP-based drug leads are often identified by phenotypic assays, and deconvolution of their molecular mechanisms of action can be time-consuming19. Fortunately, there have been substantial advances20 both in the development of screening assays (for example, harnessing the potential of induced pluripotent stem cells and gene editing technologies) and in strategies to identify the modes of action of active compounds (reviewed previously21–23).

Here, we discuss recent technological and scientific advances that may help to overcome challenges in NP-based drug discovery, with an emphasis on three areas: analytical techniques, genome mining and engineering, and cultivation systems. In the concluding section, we highlight promising future directions for NP drug discovery.

Application of analytical techniques
Classical NP-based drug research starts with biological screening of ‘crude’ extracts to identify a bioactive ‘hit’ extract, which is further fractionated to isolate the active NPs. Bioactivity-guided isolation is a laborious process with a number of limitations, but various strategies and technologies can be used to address some of them (Fig. 2). For example, to create libraries that are compatible

[Figure 1 diagram. Process steps: Organisms, Crude extracts, Fractions (multiple rounds of fractionation), Single compounds, with bioactivity evaluation.]
Problem 1: Not possible to culture the organism out of the natural habitat.
Problem 2: Relevant NPs are not produced when the organism is taken out of the natural habitat.
Solutions for 1 and 2: New methods for culturing; new methods for in situ analysis.
Solution for 2: New methods for NP synthesis induction and heterologous expression of biosynthetic genes.
Problem 3: Presence of NPs that are known.
Solution for 3: New methods for dereplication.
Problem 4: Presence of NPs that do not have drug-like properties.
Problem 5: Bioactive NPs are present in insufficient amounts.
Solution for 4 and 5: New methods for extraction and pre-fractionation of extracts.
Problem 6: Unknown molecular targets for NPs identified with phenotypic assays.
Solution for 6: New methods for molecular mode of action elucidation.

Fig. 1 | Outline of traditional bioactivity-guided isolation steps in natural product drug discovery. Steps in the process are shown in purple boxes, with associated key limitations shown in red boxes and advances that are helping to address these limitations in modern natural product (NP)-based drug discovery shown in green boxes. The process begins with extraction of NPs from organisms such as bacteria. The choice of extraction method determines which compound classes will be present in the extract (for example, the use of more polar solvents will result in a higher abundance of polar compounds in the crude extract). To maximize the diversity of the extracted NPs, the biological material can be subjected to extraction with several solvents of different polarity. Following the identification of a crude extract with promising pharmacological activity, the next step is its (often multiple) consecutive bioactivity-guided fractionation until the pure bioactive compounds are isolated. A key limitation for the potential of this approach to identify novel NPs is that many potential source organisms cannot be cultured or stop producing relevant NPs when taken out of their natural habitat. These limitations are being addressed through development of new methods for culturing, for in situ analysis, for NP synthesis induction and for heterologous expression of biosynthetic genes. At the crude extract step, challenges include the presence in the extracts of NPs that are already known, NPs that do not have drug-like properties or insufficient amounts of NPs for characterization. These challenges can be addressed through the development of methods for dereplication, extraction and pre-fractionation of extracts. Finally, at the last stage, when bioactive compounds are identified by phenotypic assays, significant time and effort are typically needed to identify the affected molecular targets. This challenge can be addressed by the development of methods for accelerated elucidation of molecular modes of action, such as the nematic protein organization technique (NPOT), drug affinity responsive target stability (DARTS), stable isotope labelling with amino acids in cell culture and pulse proteolysis (SILAC-PP), the cellular thermal shift assay (CETSA) and an extension known as thermal proteome profiling (TPP), stability of proteins from rates of oxidation (SPROX), the similarity ensemble approach (SEA) and bioinformatics-based analysis of connectivity (connectivity map, CMAP)23,189–192.

Lipinski’s rule of five
This guideline for the likelihood of a compound having oral bioavailability is based on several characteristics containing the number 5. It predicts that a molecule is likely to have poor absorption or permeation if it has more than one of the following characteristics: there are >5 H-bond donors and >10 H-bond acceptors; the molecular weight is >500; or the partition coefficient LogP is >5. Notably, natural products were identified as common exceptions at the time of publication in 1997.

Dereplication
Pharmacological screening of natural product extracts yields hits potentially containing multiple natural products that need to be considered for further study to identify the bioactive compounds. Dereplication is the process of recognizing and excluding from further study such hit mixtures that contain already known bioactive compounds.

Phenotypic assays
Assays that rely on the ability of tested compounds to exert desired phenotypic changes in cells, isolated tissues, organs or animals. They offer a complementary strategy to target-based assays for identifying new potential drugs.

Nature Reviews | Drug Discovery volume 20 | March 2021 | 201
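As a minimal sketch of the rule-of-five guideline defined in the glossary above, the following Python code counts how many thresholds a compound exceeds, starting from precomputed descriptors. The `Descriptors` container, the function names and the example values are hypothetical and introduced only for illustration. The sketch assumes the descriptors were calculated elsewhere, and it treats H-bond donors and H-bond acceptors as separate criteria, as in the usual four-criterion statement of the rule.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_weight: float  # in Da
    log_p: float  # octanol-water partition coefficient


def rule_of_five_violations(d: Descriptors) -> list[str]:
    """Return the names of the rule-of-five thresholds that are exceeded."""
    checks = {
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
        "molecular weight > 500": d.molecular_weight > 500,
        "LogP > 5": d.log_p > 5,
    }
    return [name for name, violated in checks.items() if violated]


def poor_absorption_predicted(d: Descriptors) -> bool:
    """Flag a compound when more than one threshold is exceeded."""
    return len(rule_of_five_violations(d)) > 1


# Made-up descriptor values, not measurements of any real compound.
small_molecule = Descriptors(2, 4, 320.0, 2.1)
large_np_like = Descriptors(7, 14, 830.0, 3.0)

print(rule_of_five_violations(small_molecule))
print(poor_absorption_predicted(large_np_like))
```

A flag from such a check is only a heuristic: as the glossary entry notes, natural products were already recognized as common exceptions to the rule.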
with high-throughput screening, crude extracts can be pre-fractionated into sub-fractions that are more suitable for automated liquid handling systems. In addition, fractionation methods can be adjusted so that sub-fractions preferentially contain compounds with drug-like properties (typically moderate hydrophilicity). Such approaches can increase the number of hits compared with using crude extracts, as well as enabling more efficient follow-up of promising hits24.

Box 1 | Natural products that activate the KEAP1/NRF2 pathway

An example of a pathway affected by diverse natural products (NPs) is the KEAP1/NRF2 pathway. This pathway regulates the expression of networks of genes encoding proteins with versatile cytoprotective functions and has essential roles in the maintenance of redox and protein homeostasis, mitochondrial biogenesis and the resolution of inflammation196–199. Activation of this pathway can protect against damage by most types of oxidants and pro-inflammatory agents, and it restores redox and protein homeostasis200. The pathway has therefore attracted attention for the development of drugs for the prevention and treatment of complex diseases, including neurological conditions such as relapsing–remitting multiple sclerosis201 and autism spectrum disorder202.

Dimethyl fumarate (DMF), the methyl ester of the NP fumarate (a tricarboxylic acid (TCA) cycle intermediate that is found in both animals and plants), is one of the earliest discovered inducers of the KEAP1/NRF2 pathway203,204. The origins of the development of DMF as a drug date back to the use in traditional medicine of the plant Fumaria officinalis. Initially, fumaric acid derivatives were used for the treatment of psoriasis as it was thought that psoriasis is caused by a metabolic deficiency in the TCA cycle that could be compensated for by repletion of fumarate205. Despite this erroneous assumption, DMF is effective in treating psoriasis, both topically and orally, and is the active principle of Fumaderm, which has been used clinically for several decades in the treatment of plaque psoriasis in Germany. More recently, a DMF formulation developed by Biogen has been tested in other immunological disorders, with successful phase III trials in multiple sclerosis206,207 leading to its approval by the FDA and EMA in 2013.

The isothiocyanate sulforaphane, isolated from broccoli (Brassica oleracea)208, is among the most potent naturally occurring inducers of the KEAP1/NRF2 pathway209 and has protective effects in animal models of Parkinson210, Huntington211 and Alzheimer212 diseases, traumatic brain injury213, spinal cord contusion injury214, stroke215, depression216 and multiple sclerosis217. Sulforaphane-rich broccoli extract preparations are being developed as preventive interventions in areas of the world with unavoidable exposure to environmental pollutants, such as China; the initial results of a randomized clinical trial showed rapid and sustained, statistically significant increases in the levels of excretion of the glutathione-derived conjugates of benzene and acrolein218, and a follow-up trial (NCT02656420) also demonstrated dose–response-dependent benzene detoxification219. In a placebo-controlled, double-blind, randomized clinical trial in young individuals (age 13–27 years) with autism spectrum disorder, sulforaphane reversed many of the clinical abnormalities202; these encouraging findings led to a recently completed clinical trial in children (age 3–12 years) (NCT02561481; results of the trial are not yet publicly available). An α-cyclodextrin complex of sulforaphane known as SFX-01 (developed by Evgen Pharma) is being clinically studied for its potential to reverse resistance to endocrine therapies in patients with ER+HER2- metastatic breast cancer (phase II trial completed220) and in patients with subarachnoid haemorrhage (phase II trial NCT02614742 recently completed; results not yet publicly available). Currently, a clinical trial of SFX-01 in patients hospitalized with COVID-19 is in its final stages of preparation.

Finally, the pentacyclic triterpenoids bardoxolone methyl (also known as RTA 402) and omaveloxolone (RTA 408), which are semi-synthetic derivatives of the NP oleanolic acid, are the most potent (active at nanomolar concentrations) activators of the KEAP1/NRF2 pathway known to date221. These compounds have shown protective effects in numerous animal models of chronic disease222, and are currently in clinical trials for a wide range of indications, such as chronic kidney disease in type 2 diabetes, pulmonary arterial hypertension, melanoma, radiation dermatitis, ocular inflammation and Friedreich’s ataxia200. Most recently, bardoxolone methyl has entered a clinical trial in patients hospitalized with confirmed COVID-19 (NCT04494646).

[Chemical structures: dimethyl fumarate, sulforaphane, bardoxolone methyl (RTA 402) and omaveloxolone (RTA 408).]

Metabolomics was developed as an approach to simultaneously analyse multiple metabolites in biological samples. Enabled by technological developments in chromatography and spectrometry, metabolomics was historically applied first in other research fields, such as biomedical and agricultural sciences2. Advances in the analytical instrumentation used in NP research25,26, coupled with computational approaches that can generate plausible NP analogue structures and their respective simulated spectra27, have also enabled application of ‘omics’ approaches such as metabolomics in NP-based drug discovery. Metabolomics can provide accurate information on the metabolite composition in NP extracts, thus helping to prioritize NPs for isolation, to accelerate dereplication28,29 and to annotate unknown analogues and new NP scaffolds. Moreover, metabolomics can detect differences between metabolite compositions in various physiological states of producing organisms and enable the generation of hypotheses to explain them, and can also provide extensive metabolite profiles to underpin phenotypic characterization at the molecular level30. Both options are very useful in understanding the molecular mechanisms of action of NPs.

For metabolite profiling, NP extracts are analysed by NMR spectroscopy or high-resolution mass spectrometry (HRMS), or respective combined methods involving upstream liquid chromatography (LC)31,32, such as LC–HRMS, which can separate numerous isomers present in NP extracts33. Moreover, such combined methods might integrate HRMS and NMR, allowing the simultaneous use of the advantages of both techniques34,35. NMR analysis of NP extracts is simple and reproducible, and provides direct quantitative information and detailed structural information, although it has relatively low sensitivity, meaning that it generally enables profiling only of major constituents33. The applications of NMR in NP research are versatile36 and the technique is used both directly for metabolomics of unfractionated NP extracts and for structural characterization of compounds and fractions obtained with appropriate separation methods, most often LC. HRMS is the gold standard for qualitative and quantitative metabolite profiling33 and is most commonly applied in combination with LC. HRMS can also be used in the direct infusion mode (called DIMS)37, whereby samples are directly profiled by MS without a chromatography step, or in MS imaging (MSI)38, which enables determination of the spatial distribution of NPs within living organisms. HRMS enables routine acquisition of accurate molecular mass information, which together with appropriate heuristic filtering can provide unambiguous assignment of molecular formulae for hundreds to thousands of metabolites within a single extract over a dynamic range that may exceed five
orders of magnitude31,39. However, challenges remain in data mining and in the unambiguous identification of the metabolites using various workflows relying on open web-based tools40.

Dereplication of secondary metabolites in bioactive extracts includes the determination of molecular mass and formula and cross-searching in the literature or structural NP databases with taxonomic information, which greatly assists the identification process. Such metadata, which are difficult to query in the literature, are often compiled in proprietary databases, such as the Dictionary of Natural Products, which encompasses all NP structures reported with links to their biological sources (see Related links). However, a comprehensive experimental tandem mass spectrometry (MS/MS) database of all NPs reported to date does not exist, and a search for experimental spectra across various platforms is hindered by the lack of standardized collision energy conditions for fragmentation in LC–MS/MS25.

In this respect, the Global Natural Products Social (GNPS) molecular networking platform developed in the Dorrestein laboratory is an important addition to the toolbox41. Molecular networking organizes thousands of sets of MS/MS data recorded from a given set of extracts and visualizes the relationship of the analytes as clusters of structurally related molecules. This improves the efficiency of dereplication by enabling annotation of isomers and analogues of a given metabolite in a cluster42. The recorded experimental spectra can be searched against putative structures and their corresponding predicted MS/MS spectra generated by tools such as competitive fragmentation modelling (CFM-ID)43. Based on such approaches, vast databases of theoretical NP spectra have been created and applied in dereplication44. The GNPS molecular networking approach has limitations, however, such as better applicability to some classes of NPs than others and the uncertainty of structural assignment among possible

[Figure 2 diagram, panel a]
Bacterial strains isolated from marine sediment from the coastal areas of Panama were used for the preparation of 234 NP extracts.
Image-based phenotypic bioactivity profiling of the 234 NP extracts in HeLa cells.
LC–HRMS-based metabolomics data also recorded with the 234 NP extracts.
Integration and clustering of the biological and chemical datasets revealed 13 unique clusters.
One of the clusters was prioritized for further study, and a combination of LC–MS and NMR analysis led to the identification of quinocinnolinomycins A–D.
[Chemical structures: quinocinnolinomycins A–D.]

[Figure 2 diagram, panel b]
156 FACs generated containing unique BGCs from three species from the Aspergillus genus.
Selected 56 FACs predicted to contain uncharacterized BGCs (i.e. BGCs with no known product or well-characterized homologue).
56 FACs transformed into the heterologous expression host (A. nidulans) and NP extracts prepared.
NP extracts subjected to untargeted LC–HRMS.
Metabolomics data analysed with FAC-Score algorithm that filters out signals present in host extracts or in more than one FAC strain.
15 new metabolites and their BGCs were characterized through combination of gene deletions within the BGCs and additional LC–MS and NMR analysis.

Fig. 2 | Applications of advanced analytical technologies empowering modern natural product-based drug discovery. a | An illustrative example of the application of liquid chromatography–high-resolution mass spectrometry (LC–HRMS) metabolomics in the screening of natural product (NP) extracts is the work of Kurita et al.58, in which 234 bacterial extracts were subjected to image-based phenotypic bioactivity screening and LC–HRMS metabolomics. Clustering of the resulting data allowed prioritization of promising extracts for further analysis, resulting in the discovery of the new NPs, quinocinnolinomycins A–D. b | Another illustrative example of LC–HRMS screening of NP extracts is the work of Clevenger et al.85, who obtained novel NP extracts through heterologous expression of fungal artificial chromosomes (FACs) containing uncharacterized biosynthetic gene clusters (BGCs) from diverse fungal species in Aspergillus nidulans. Analysis of the LC–HRMS metabolomics data with a FAC-Score algorithm directed the simultaneous discovery of 15 new NPs and the characterization of their BGCs.
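The filtering idea summarized in Fig. 2b (discarding signals that also occur in host extracts or in more than one FAC strain) can be illustrated with a simple set-based sketch in Python. This is not the published FAC-Score algorithm, whose scoring details are not given here. The function name, the strain labels and the string feature identifiers are hypothetical, and the sketch assumes that LC–HRMS signals have already been aligned across samples so that the same feature carries the same identifier everywhere.

```python
def strain_specific_signals(
    strain_signals: dict[str, set[str]],
    host_signals: set[str],
) -> dict[str, set[str]]:
    """Keep, per strain, only signals absent from the host and unique to that strain."""
    # Count in how many strains each aligned signal was detected.
    counts: dict[str, int] = {}
    for signals in strain_signals.values():
        for signal in signals:
            counts[signal] = counts.get(signal, 0) + 1

    # A signal survives if the host control lacks it and exactly one strain has it.
    return {
        strain: {
            signal
            for signal in signals
            if signal not in host_signals and counts[signal] == 1
        }
        for strain, signals in strain_signals.items()
    }


# Hypothetical aligned feature identifiers, for illustration only.
host = {"f1", "f2"}
strains = {
    "FAC_A": {"f1", "f3", "f4"},
    "FAC_B": {"f2", "f4", "f5"},
}
print(strain_specific_signals(strains, host))
```

In this toy input, `f1` and `f2` are removed because the host also produces them and `f4` is removed because two strains share it, leaving one candidate signal per strain. Real data would additionally need tolerance-based matching of m/z and retention time before such a comparison.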
predicted candidates. Efforts to address such issues are ongoing45–47, including overlaying molecular networks of large NP extract libraries with taxonomic information to improve the confidence of annotation48. Overall, molecular networking mainly allows better prioritization of the isolation of unknown compounds by strengthening the dereplication process and elucidating relationships between NP analogues, and rigorous structure elucidation for NPs of interest should not be neglected.

Another useful platform for metabolite identification is METLIN49, which includes a high-resolution MS/MS database with a fragment similarity search function that is useful for identification of unknown compounds. Other databases and in silico tools such as Compound Structure Identification (CSI): FingerID and Input Output Kernel Regression (IOKR) can be used to search available fragment ion spectra, as well as to generate predicted spectra of fragment ions not present in current databases50. A novel computational platform for predicting the structural identity of metabolites derived from any identified compound has also been recently reported51, which should increase the searchable chemical space of NPs.

To accelerate the identification of bioactive NPs in extracts, metabolomics data can be matched to the biological activities of these extracts52. Various chemometric methods such as multivariate data analysis can correlate the measured activity with signals in the NMR and MS spectra, enabling the active compounds to be traced in complex mixtures with no need for further bioassays53–55. Furthermore, several analytical modules involving different bioassays and detection technologies can be linked to allow simultaneous bioactivity evaluation and identification of compounds present in small amounts (analytical scale) in complex compound mixtures34,35.

Metabolomics data can be integrated with data obtained by other omics techniques such as transcriptomics and proteomics and/or with imaging-based screens. For example, Acharya et al. used this approach to characterize NP-mediated interactions between a Micromonospora species and a Rhodococcus species56. In another interesting example, Kurita et al. developed a compound activity mapping platform for the prediction of identities and mechanisms of action of constituents from complex NP extract libraries by integrating cytological profiling57 with untargeted metabolomics data from a library of extracts58, and identified quinocinnolinomycins as a new family of NPs causing endoplasmic reticulum stress58 (Fig. 2a).

Analytical advances that enable the profiling of responses to bioactive molecules at the single-cell level can also accelerate NP-based drug discovery. Irish, Bachmann, Earl and colleagues developed a high-throughput platform for metabolomic profiling of bioactivity by integrating phospho-specific flow cytometry, single-cell chemical biology and cellular barcoding with metabolomic arrays (characterized chromatographic microtitre arrays originating from biological extracts)59. Using this platform, the authors studied the single-cell responses of bone marrow biopsy samples from patients with acute myeloid leukaemia following exposure to microbial metabolomic arrays obtained from extracts of biosynthetically prolific bacteria, which enabled the identification of new bioactive polyketides59.

Finally, advances in analytical technologies continue to support the rigorous structure determination of NPs of interest. The progressive development of higher-field NMR instruments and probe technology60,61 has enabled NP structure determination from very small quantities (below 10 µg)62,63, which is important, as the available quantities of NPs are often limited. In addition, microcrystal electron diffraction (MicroED) has recently emerged as a cryo-electron microscopy-based technique for unambiguous structure determination of small molecules64 and is already finding important applications in NP research65. The increased resolution and sensitivity of analytical equipment can also help address problems associated with ‘residual complexity’ of isolated NPs; that is when biologically potent but unidentified impurities in an isolated NP sample (which could include structurally related metabolites or conformers) lead to an incorrect assignment of structure and/or activity66,67. To avoid futile downstream development efforts, Pauli and colleagues recommended that lead NPs should undergo advanced purity analysis at an early stage using quantitative NMR and LC–MS67.

Genome mining and engineering
Advances in knowledge on biosynthetic pathways for NPs and in developing tools for analysing and manipulating genomes are further key drivers for modern NP-based drug discovery. Two key characteristics enable the identification of biosynthetic genes in the genomes of the producing organisms. First, these genes are clustered in the genomes of bacteria and filamentous fungi. Second, many NPs are based on polyketide or peptide cores, and their biosynthetic pathways involve enzymes (polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs), respectively) that are encoded by large genes with highly conserved modules68.

‘Genome mining’ is based on searches for genes that are likely to govern biosynthesis of scaffold structures, and can be used to identify NP biosynthetic gene clusters69–71. Prioritization of gene clusters for further work is facilitated by advances in biosynthetic knowledge and predictive bioinformatics tools, which can provide hints about whether the metabolic products of the clusters have chemical scaffolds that are new or known, thereby supporting dereplication72,73. Such predictive tools for gene cluster analysis can be applied in combination with spectroscopic techniques to accelerate the identification of NPs65 and determine the stereochemistry of metabolic products66. Furthermore, to extend genome mining from a single genome to entire genera, microbiomes or strain collections, computational tools have been developed, such as BiG-SCAPE, which enables sequence similarity analysis of biosynthetic gene clusters, and CORASON, which uses a phylogenomic approach to elucidate evolutionary relationships between gene clusters74.

Phylogenomic approach
The use of genomic data to reveal evolutionary relationships. In the context of natural product drug discovery, the use of phylogenomics is based on the assumption that organisms that have closer evolutionary relationships are more likely to produce similar natural products.

Phylogenetic studies of known groups of talented secondary metabolite producers can also empower discovery of novel NPs. Recently, a study comparing secondary metabolite profiles and phylogenetic data in
myxobacteria demonstrated a correlation between the taxonomic distance and the production of distinct secondary metabolite families75. In filamentous fungi, it was likewise shown that secondary metabolite profiles are closely correlated with their phylogeny76. These organisms are rich in secondary metabolites, as demonstrated by LC–MS studies of their extracts under laboratory conditions77. Concurrent genomic and phylogenomic analyses implied that even the genomes of well-studied organism groups harbour many gene clusters for secondary metabolite biosynthesis with as yet unknown functions78. The phylogeny of biosynthetic gene clusters, together with analysis of the absence of known resistance determinants, was recently used to prioritize members of the glycopeptide antibiotic family that could have novel activities. This led to the identification of the known antibiotic complestatin and the newly discovered corbomycin as compounds that act through a previously uncharacterized mechanism involving inhibition of peptidoglycan remodelling79.

Taxonomic distance: The distance of compared taxa on a constructed phylogenetic tree (also known as an evolutionary tree). Closer distance of compared taxa indicates a closer evolutionary relationship.

Many microorganisms cannot be cultured, or tools for their genetic manipulation are not sufficiently developed, which makes it more challenging to access their NP-producing potential. However, biosynthetic gene clusters for NPs can be cloned and heterologously expressed in organisms that are well-characterized and easier to culture and to genetically manipulate (such as Streptomyces coelicolor, Escherichia coli and Saccharomyces cerevisiae)80. The aim is to achieve higher production titres in the heterologous hosts than in wild-type strains, improving the availability of lead compounds80–82. Vectors that can carry large DNA inserts are needed for the cloning of complete NP biosynthetic gene clusters. Cosmids (which can have inserts of 30–40 kb), fosmids (which can harbour 40–50 kb) and bacterial artificial chromosomes (BACs; which can have inserts of 100 kb to >300 kb) have been developed83. For fungal gene clusters, self-replicating fungal artificial chromosomes (FACs) have been developed, which can have inserts of >100 kb (ref.84). FACs in combination with metabolomic scoring were used to develop a scalable platform, FAC-MS, allowing the characterization of fungal biosynthetic gene clusters and their respective NPs at unprecedented scale85. The application of FAC-MS for the screening of 56 biosynthetic gene clusters from different fungal species yielded the discovery of 15 new metabolites, including a new macrolactone, valactamide A85 (Fig. 2b).

Even in culturable microorganisms, many biosynthetic gene clusters may not be expressed under conventional culture conditions, and these silent clusters could represent a large untapped source of NPs with drug-like properties86. Several approaches can be pursued to identify such NPs. One approach is sequencing, bioinformatic analysis and heterologous expression of silent biosynthetic gene clusters, which has already led to the discovery of several new NP scaffolds from cultivable strains87. Direct cloning and heterologous expression was also used to discover the new antibiotic taromycin A, which was identified upon the transfer of a silent 67 kb NRPS biosynthetic gene cluster from Saccharomonospora sp. CNQ-490 into S. coelicolor88. To transfer a biosynthetic gene cluster of such size, a platform based on transformation-associated recombination (TAR) cloning was developed. This platform enables direct cloning and manipulation of large biosynthetic gene clusters in S. cerevisiae, maintenance and manipulation of the vector in E. coli, and heterologous expression of the cloned gene clusters in Actinobacteria (such as S. coelicolor) following chromosomal integration88, and is an alternative to BACs for heterologous expression of large biosynthetic gene clusters.

Heterologous expression has limitations, such as the need to clone and manipulate very large genome regions occupied by biosynthetic gene clusters and the difficulty of identifying a suitable host that provides all conditions necessary for the production of the corresponding NPs. These limitations can be circumvented by activating biosynthetic gene clusters directly in the native microorganism through targeted genetic manipulations, generally involving the insertion of activating regulatory elements or deletion of inhibitory elements such as repressors or their binding sites. For example, a derepression strategy of deleting gbnR, a gene for a transcriptional repressor in Streptomyces venezuelae ATCC 10712 was used by Sidda et al. in the discovery of gaburedins, a family of γ-aminobutyrate-derived ureas89. An example of the activator-based strategy is the constitutive expression of the samR0484 gene in Streptomyces ambofaciens ATCC 23877, which led to the discovery of stambomycins A–D, 51-membered cytotoxic glycosylated macrolides72. Alternatively, silent biosynthetic gene clusters can be activated using repressor decoys90, which have the same DNA nucleotide sequence as the binding sites for the repressors that prevent the expression of the clusters. When these decoys are introduced into the bacteria, they sequester the respective repressors, and the ‘endogenous’ binding sites in the genome remain unoccupied, leading to derepression of the previously silent biosynthetic genes and production of the corresponding NPs. This approach has been applied to activate eight silent biosynthetic gene clusters in multiple streptomycetes and led to the characterization of a novel NP, oxazolepoxidomycin A90. The repressor decoy strategy is simpler, easier and faster to perform than the deletion of genes encoding regulatory factors. However, it has the same limitation as other approaches that rely on the introduction of recombinant DNA molecules into cells: it is necessary to develop protocols for efficient introduction of DNA into the targeted host strain, and the decoy must be maintained on a high-copy plasmid to ensure efficient repressor sequestration.

Another approach focused on exchange of regulatory elements is based on the CRISPR–Cas9 technology. The promise of this technique is exemplified in a recent work by Zhang et al., which demonstrated that CRISPR–Cas9-mediated targeted promoter introduction can efficiently activate diverse biosynthetic gene clusters in multiple Streptomyces species, leading to the production of unique metabolites, including a novel polyketide in Streptomyces viridochromogenes91. The CRISPR–Cas9 technology was also used to knock out genes encoding two well-known and frequently rediscovered antibiotics in several actinomycete strains, which led to the
[Figure 3: text recovered from the figure panels; chemical structure drawings omitted]
a | DNA isolation from specific microorganism/environment/microbiota; Sequencing; Bioinformatics analysis (genome mining); Heterologous expression; Genetic manipulations of the native host; NP chemical synthesis; NP isolation and characterization.
b | Sequenced metagenomes of 2,000 soil samples. Analysed sequences for BGCs that code for lipopeptides with calcium-binding motifs and clustered to create a phylogenetic tree. Selected a soil sample rich in BGCs from a tree branch not associated with the BGCs for known calcium-binding antibiotics and cloned DNA in a cosmid library. Identified a predicted BGC and transferred into Streptomyces albus using transformation-associated recombination. Extracts from cultures of S. albus harbouring the BGC found to show antibacterial activity against Staphylococcus aureus. Malacidins isolated from cultures and their structures elucidated using a combination of mass spectrometry and NMR data, supported by bioinformatics analysis of the BGC. Structures shown: malacidin A (R = Me) and malacidin B (R = Et).
c | Analysed genomic sequence data from human microbiome for gene clusters predicted to encode large (≥5 residues) nonribosomal peptides. Identified 57 unique NRPS gene clusters. Chemically synthesized 25 ‘synthetic–bioinformatic’ NP-like compounds predicted to be encoded by the analysed gene clusters. Tested the 25 NP-like compounds for activity against a panel of common human commensal and pathogenic bacteria. Identified humimycins as new antibiotics active against methicillin-resistant S. aureus. Structures shown: humimycin A (R1 = L-Phe, R2 = L-Val) and humimycin B (R1 = L-Tyr, R2 = L-Ile).
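The first two boxes of panel c amount to a filter over predicted gene clusters: keep those predicted to encode a peptide of at least five residues, then count each distinct cluster once. The sketch below illustrates that logic only and is not the pipeline used in the study. The record type `PredictedCluster`, the function `select_large_unique`, the cluster identifiers and the residue predictions are all hypothetical, and treating two clusters as duplicates when their predicted residue sequences match is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class PredictedCluster:
    """Hypothetical record for one NRPS gene cluster (illustration only)."""
    cluster_id: str
    predicted_residues: Tuple[str, ...]  # one predicted amino acid per module


def select_large_unique(clusters: Iterable[PredictedCluster],
                        min_residues: int = 5) -> List[PredictedCluster]:
    """Keep clusters predicted to encode at least min_residues residues,
    dropping later clusters whose predicted peptide was already seen."""
    seen = set()
    selected = []
    for cluster in clusters:
        if len(cluster.predicted_residues) < min_residues:
            continue  # below the size threshold (>=5 residues by default)
        if cluster.predicted_residues in seen:
            continue  # assumed duplicate: same predicted peptide
        seen.add(cluster.predicted_residues)
        selected.append(cluster)
    return selected


# Made-up example data; not taken from the study.
EXAMPLE_CLUSTERS = [
    PredictedCluster("c1", ("Phe", "Val", "Ala", "Ser", "Thr", "Leu", "Gly")),
    PredictedCluster("c2", ("Ala", "Gly", "Ser")),  # too short
    PredictedCluster("c3", ("Phe", "Val", "Ala", "Ser", "Thr", "Leu", "Gly")),  # same as c1
    PredictedCluster("c4", ("Tyr", "Ile", "Ala", "Ser", "Thr")),
]

if __name__ == "__main__":
    for cluster in select_large_unique(EXAMPLE_CLUSTERS):
        print(cluster.cluster_id, "-".join(cluster.predicted_residues))
```

In practice the hard part is the residue prediction itself, which this sketch takes as given; as the text notes, the accuracy of such structure predictions is one of the limits of the ‘synthetic–bioinformatic’ approach.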
Fig. 3 | Strategies for genome mining-driven discovery of natural products and natural product-like compounds. a | Genome mining-based approaches to explore the biosynthetic capacity of microorganisms rely on DNA extraction, sequencing and bioinformatics analysis. The vast majority of microbes from different environments and microbiota communities have not been cultured, and their capacity to produce natural products (NPs) was largely inaccessible until recently. In the case of unculturable microorganisms, the bioinformatics analysis step can be followed by either targeted heterologous expression of biosynthetic gene clusters (BGCs) prioritized as being likely to yield relevant new NPs or direct chemical synthesis of ‘synthetic–bioinformatic’ NP-like compounds. b,c | These two approaches are exemplified by the recent discoveries of malacidins (panel b) and humimycins (panel c), respectively93,94. A major strength of the ‘synthetic–bioinformatic’ approach is that it is entirely independent of microbial culture and gene expression. Its limitations are the accuracy of computational chemical structure predictions and the feasibility of total chemical synthesis. NRPS, nonribosomal peptide synthetase.

production of different rare and previously unknown variants of antibiotics that were otherwise obscured, including amicetin, thiolactomycin, phenanthroviridin and 5-chloro-3-formylindole92.

Approaches that rely on sequencing, bioinformatics and heterologous expression can also enable the identification of novel NPs from bacterial strains that have not yet been cultivated (Fig. 3a). For example, Hover et al. searched the metagenomes of 2,000 soil samples for biosynthetic gene clusters for lipopeptides with calcium-binding motifs. This led to the discovery of malacidins, members of the calcium-dependent antibiotic family, via heterologous expression of a 72 kb biosynthetic gene cluster from a desert soil sample in a Streptomyces albus host strain93 (Fig. 3b). However, in comparison with some of the other above-discussed strategies72,89,90, this metagenome-based discovery approach is more suited to finding new members of known NP classes rather than discovery of entirely new classes. In another study, Chu et al. developed a human microbiome-based approach that identified nonribosomal linear heptapeptides called humimycins as novel antibiotics active against methicillin-resistant Staphylococcus aureus (MRSA)94 (Fig. 3c). The structure of the NPs was predicted via bioinformatics analysis of gene clusters found in human commensal bacteria, followed by their chemical synthesis. A major strength of this innovative approach is that it is entirely independent of microbial cultivation and heterologous gene expression. Nevertheless, there are limitations related to the accuracy of computational chemical structure predictions and the feasibility of total chemical synthesis if structures are complex.

The genomes of plants or animals can also be mined for novel NPs. For example, mining of 116 plant genomes enabled by identification of a precursor gene for the biosynthesis of lyciumins, a class of branched cyclic ribosomal peptides with hypotensive action produced by Lycium barbarum (popularly known as goji), identified diverse novel lyciumin chemotypes in seven other plants, including crops such as soybean, beet, quinoa and eggplant95. Genome mining in the animal kingdom is exemplified by the work of Dutertre et al., which used an integrated transcriptomics and proteomics approach to discover thousands of novel venom peptides from Conus marmoreus snails96. Proteomics analysis revealed that the vast majority of the conopeptide diversity was derived from a set of ~100 genes through variable peptide processing96.

Some bioactive compounds initially isolated from marine organisms might be products of symbionts, and genome mining can facilitate the characterization of such NPs. For example, it has been shown that bioactive compounds from the sponge Theonella swinhoei are produced by bacterial symbionts97, and characterization of the symbiont ‘Candidatus Entotheonella serta’ using single-cell genomics led to the discovery of gene clusters for misakinolide and theonellamide biosynthesis98. Another example of a marine NP produced by a bacterial symbiont is ET-743 (trabectedin), originally isolated from the tunicate Ecteinascidia turbinata. A meta-omics approach developed by Rath et al. revealed that the producer of this clinically used anticancer agent is the bacterial symbiont ‘Candidatus Endoecteinascidia frumentensis’99.

Similarly, plant microbiomes also represent a large reservoir for the identification of novel bioactive NPs (such as the antitumour agents maytansine, paclitaxel and camptothecin, which were initially isolated from plants and later shown to be produced by microbial endophytes)100 that can be tapped by genome mining approaches. An illustrative example is a recent work by Helfrich et al. that identified hundreds of novel biosynthetic gene clusters by genome mining of 224 bacterial strains isolated from Arabidopsis thaliana leaves101. A combination of bioactivity screening and imaging mass spectrometry was used to select a single species for further genomic analysis and led to the isolation of a NP with an unprecedented structure, the trans-acyltransferase PKS-derived antibiotic macrobrevin101.

Targeted genetic engineering of NP biosynthetic gene clusters can be of high value if the producing organism is difficult to cultivate or the yield of a NP is too low to allow comprehensive NP characterization. Rational genetic engineering and heterologous expression contributed to increase the production of vioprolides, a depsipeptide class of anticancer and antifungal NPs in the myxobacterium Cystobacter violaceus Cb vi35, by several orders of magnitude. In addition, non-natural vioprolide analogues were generated by this approach102. Similarly, promoter engineering and heterologous expression of biosynthetic gene clusters was reported to result in a 7-fold increase in the production of the cytotoxic NP disorazol103, and a 328-fold increase in the production of spinosad, an insecticidal macrolide produced by the bacterium Saccharopolyspora spinosa104.

Besides increasing NP yields, targeted gene manipulation can also be used to alter biosynthetic pathways in a predictable manner to produce new NP analogues with improved pharmacological properties, such as higher specific activity, lower toxicity and better pharmacokinetics. Such biosynthetic engineering approaches depend on a solid understanding of the biosynthetic pathway leading to a specific NP, access to the genes specifying this pathway and the ability to manipulate them in either the original or a heterologous host.

Recent advances in biosynthetic engineering have enabled faster and more efficient production of NP analogues, including the development of methods for
accelerated engineering and recombination of modules of PKS gene clusters105, NRPSs106,107 and NRPS–PKS assembly lines108, as well as elucidation of mechanisms for polyketide chain release that are contributing to NP structural diversification109,110. Examples of biosynthetic engineering applied to several important NPs include the generation of analogues of the immunosuppressant rapamycin111, the antitumour agents mithramycin112 and bleomycin113, and the antifungal agent nystatin114.

It should be noted that biosynthetic engineering has limitations regarding the parts of the NP molecule that can be targeted for modifications, and the chemical groups that can be introduced or removed. Considering the complexity of many NPs, however, total synthesis may be prohibitively costly, and a combined approach of biosynthetic engineering and chemical modification can provide a viable alternative for identifying improved drug candidates. For example, biosynthetic engineering may create a ‘handle’ for addition of a beneficial chemical group by synthetic chemistry, as demonstrated for the biosynthetically engineered analogues of nystatin mentioned above; further synthetic chemistry modifications resulted in compounds with improved in vivo pharmacotherapeutic characteristics compared with amphotericin B115,116.

Advances in microbial culturing systems

The complex regulation of NP biosynthesis in response to the environment means that the conditions under which producing organisms are cultivated can have a major impact on the chance of identifying novel NPs87. Several strategies have been developed to improve the likelihood of identifying novel NPs compared with monoculture under standard laboratory conditions and to make ‘uncultured’ microorganisms grow in a simulated natural environment117 (Fig. 4).

One well-established approach to promote the identification of novel NPs is the modulation of culture conditions such as temperature, pH and nutrient sources. This strategy may lead to activation of silent gene clusters, thereby promoting production of different NPs. The term ‘One Strain Many Compounds’ (OSMAC) was coined for this approach about 20 years ago118, but the concept has a longer history119, with its use being routine in industrial microbiology since the 1960s120.

While OSMAC is still widely used for the identification of new bioactive compounds121,122, this approach has limited capacity to mimic the complexities of natural habitats. It is difficult to predict the combination of cues (which might also involve metabolites secreted by other members of the microbial community) to which the microorganism has evolved to respond by switching metabolic programmes. To account for such kinds of interactions, co-culturing using ‘helper’ strains can be applied123. This can enable the production and identification of new NPs, as illustrated by recent studies in which particular fungi were co-cultured with Streptomyces species124,125.

Study of the molecular mechanisms underlying the ability of helper strains to increase the cultivability of previously uncultured microbes can lead to the identification of specific growth factors, allowing expansion of the number of species that can be successfully cultured. This strategy was used by D’Onofrio et al. for the identification of new acyl-desferrioxamine siderophores (iron-chelating compounds) as growth factors produced by helper strains promoting the growth of previously uncultured isolates from marine sediment biofilm117,126. The siderophore-assisted growth is based on the property of these compounds to provide iron for microbes unable to autonomously produce siderophores themselves, and the application of this approach led to the isolation of previously uncultivated microorganisms126. The development of strategies to cultivate microbial symbionts that produce NPs only upon interaction with their hosts can promote access to new NPs. Microbial symbionts interacting with insects or other organisms are a highly promising reservoir for the discovery of novel bioactive NPs produced in a unique ecological context127–130. To stimulate NP production, culturing strategies can be developed that better mimic the native environment of microbial symbionts of insects, including the use of media containing either lyophilized dead insects131 or L-proline, a major constituent of insect haemolymph132.

Strategies to mimic the natural environment even more closely by harnessing in situ incubation in the environment from which the microorganism is sampled have been developed, dating back to more than 20 years ago with the biotech companies OneCell and Diversa. They developed platforms that allowed the growth of some previously uncultivated microbes from various environments based on diluting out and suspension in a single drop of medium120,133. More recently, such strategies have been highlighted by the development and application of a platform dubbed the iChip, in which diluted soil samples are seeded in multiple small chambers separated from the environment with a semipermeable membrane134. After seeding, the iChip is placed back into the soil from which the sample was taken for an in situ incubation period, allowing the cultured microorganisms to be exposed to influences from their native environment. The power of this culturing approach was demonstrated by the discovery of a new antibiotic, teixobactin, produced by a previously uncultured soil bacterium135,136 (Fig. 4a). This platform may be of great significance for NP drug discovery, given that it has been estimated that only 1% of soil organisms have so far been successfully cultured using traditional culturing techniques137.

The omics strategies discussed in previous sections can complement efforts to explore NPs produced upon microbial interactions. The application of such a strategy is illustrated in the work of Derewacz et al., who analysed the metabolome of a genome-sequenced Nocardiopsis bacterium upon co-culture with bacteria of the genera Escherichia, Bacillus, Tsukamurella and Rhodococcus138. Around 14% of the metabolomic features found in co-cultures were undetectable in monocultures, with many of those being unique to specific co-culture genera, and the previously unreported polyketides ciromicin A and B, which possess an unusual pyrrolidinol substructure and displayed moderate and selective
cytotoxicity, were identified138. Other examples include a ‘culturomics’ approach that combines multiple culture conditions with MS profiling and 16S rRNA-based taxonomy to identify prokaryotic species from the human gut139, and an ultrahigh-throughput screening platform based on microfluidic droplet single-cell encapsulation and cultivation followed by next-generation sequencing and LC–MS, which allows investigation of pairwise interactions between target microorganisms140. The latter approach enabled identification of a slow-growing oral microbiota species that inhibits the growth of S. aureus140.

Historically early-adopted microbial culturing approaches led to a bias reflected in the predominant discovery of NPs from microorganisms that are easy to cultivate (such as streptomycetes and some common filamentous fungi). As a result, a vast number of NPs from such ‘easy to culture’ microbes have already been characterized, and conventional screening efforts tend to yield disappointing returns associated with frequent rediscovery of known NPs and their closely related congeners. Therefore, culturing strategies aimed at previously unexplored (or under-investigated) microbial groups, with the potential to produce NPs with entirely new scaffolds and bioactivities (such as Burkholderia, Clostridium and Xenorhabdus) are of high interest141,142. Closthioamide, the first secondary metabolite from a strictly anaerobic bacterium, was discovered from Clostridium cellulolyticum by this approach143. Targeted isolation of such species is important, and a genome-guided approach to achieve this goal has recently been demonstrated for Burkholderia strains in environmental samples144. Another highly innovative approach to the isolation and cultivation of previously uncultured bacteria was

[Figure 4: text recovered from the figure panels; chemical structure drawing of teixobactin omitted]
a | iChip device with diluted soil samples incubated in soil to simultaneously grow and isolate uncultured bacteria. Extracts from 10,000 iChip isolates screened for antimicrobial activity against Staphylococcus aureus. An extract from a new bacterial species, Eleftheria terrae, showed good antimicrobial activity against S. aureus. Compound isolation from the E. terrae extract, and structure elucidation by NMR and advanced Marfey’s analysis yielded teixobactin, a new antibiotic with activity against Gram-positive bacteria.
b | Microbial species selected from human oral microbiome for targeted isolation based on genomic sequence data. Bioinformatics analysis identified membrane proteins with extracellular domains that could serve as antigens for antibody development. Antigens representative of the selected extracellular protein domains designed, synthesized and injected into rabbits for antibody production. IgG purified from the rabbit serum and fluorescently labelled. Oral microbiota samples stained with fluorescently labelled antibodies, followed by flow cytometry and cell sorting to isolate the targeted bacterial species. Three species of Saccharibacterium isolated along with their interacting Actinobacterium hosts, as well as SR1 bacteria that are members of a candidate phylum with no previously cultured representatives.

Fig. 4 | Application of advanced microbial culturing approaches to identify new natural products. New strategies for isolating previously uncultured microorganisms can enable access to new natural products (NPs) produced by them. a | To recapitulate the effect of complex signals coming from the native environment, microorganisms can be cultivated directly in the environment from which they were isolated. This concept is used with the iChip platform, in which diluted environmental samples are seeded in multiple small chambers separated from the native environment with a semipermeable membrane. The potential of this approach is illustrated by the recent discovery of teixobactin, a new antibiotic with activity against Gram-positive bacteria134,135. b | Another important recent development involves obtaining information from environmental samples using omics techniques such as metagenomics to identify and partially characterize microorganisms present in a specific environment before culturing. An approach relying on such preliminary information was recently used to engineer the capture of antibodies based on genetic information, which resulted in the successful cultivation of previously uncultured bacteria from the human mouth145. This reverse genomics workflow was validated by the isolation and cultivation of three species of Saccharibacteria (TM7) along with their interacting Actinobacteria hosts, as well as SR1 bacteria that are members of a candidate phylum with no previously cultured representatives.
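The dilution step that the iChip relies on (Fig. 4a) can be reasoned about with a simple loading model. The sketch below assumes that cells from the diluted sample land in chambers independently, so that the number of cells per chamber follows a Poisson distribution. The source does not state this model, and the function names and example numbers are hypothetical. Under that assumption, the fraction of chambers that receive exactly one cell is largest when the mean is one cell per chamber.

```python
import math


def chamber_occupancy(mean_cells_per_chamber: float) -> dict:
    """Poisson probabilities that a chamber receives 0, exactly 1, or more than 1 cell."""
    if mean_cells_per_chamber < 0:
        raise ValueError("mean_cells_per_chamber must be non-negative")
    lam = mean_cells_per_chamber
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


def fold_dilution(cells_per_ml: float, chamber_volume_ml: float,
                  target_mean: float = 1.0) -> float:
    """Fold dilution so that one chamber volume holds target_mean cells on average."""
    if min(cells_per_ml, chamber_volume_ml, target_mean) <= 0:
        raise ValueError("all arguments must be positive")
    return cells_per_ml * chamber_volume_ml / target_mean


if __name__ == "__main__":
    # Hypothetical numbers: 1e8 cells per ml in the sample, 1 microlitre chambers.
    print("fold dilution:", fold_dilution(1e8, 1e-3))
    for mean in (0.5, 1.0, 2.0):
        print(mean, chamber_occupancy(mean))
```

The trade-off this exposes is between empty chambers (too dilute) and chambers holding mixed populations (too concentrated); real samples also violate the independence assumption when cells clump or adhere to soil particles.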