Unveiling potential anticancer drugs through in silico drug repurposing approaches


List of abbreviations

ABPP

Activity-based protein profiling

ADR

Adverse drug reactions

AI

Artificial intelligence

AML

Acute myeloid leukemia

ANN

Artificial neural networks

ATC

Anatomical Therapeutic Chemical classification

AUC

Area under curve

AUROC

Area under the receiver operating characteristic curve

BBB

Blood–brain barrier

BSCE

Basespace Correlation Engine

CancerDR

Cancer drug resistance

CCLE

Cancer Cell Line Encyclopedia

CCLP

Cancer cell line profiler

CD

Cancer drug

CDK2

Cyclin-dependent kinase 2

CDRscan

Cancer drug response profile scan

CGC

Cancer Gene Census

CGP

Cancer Genome Project

CHAT

Cancer Hallmarks Analytics Tool

cis-eQTL

cis-Expression quantitative trait loci

CMap

Connectivity Map

CNA

Copy number alterations

CNN

Convolutional neural network

COSMIC

Catalog Of Somatic Mutations In Cancer

CPTAC

Clinical Proteomic Tumor Analysis Consortium

CRC

Colorectal cancer

CTSA

Clinical and Translational Science Awards

CTD

Comparative Toxicogenomics Database

CTRP

Cancer Therapeutic Response Portal

CWR

Cures Within Reach

DAPPLE

Disease Association Protein–Protein Link Evaluator

DAVID

Database for Annotation, Visualization, and Integrated Discovery

DEG

Differentially expressed genes

DeSigN

Differentially Expressed Gene Signatures

DIRAC

Differential Rank Conservation

DNMT

DNA methyltransferases

DP

Disease proteins

Drug–SE

Drug–side effect

EBI

European Bioinformatics Institute

EHR

Electronic Health Records

EMBL

European Molecular Biology Laboratory

eMERGE

Electronic Medical Records and Genomics

EMT

Epithelial–mesenchymal transition

ER

Estrogen receptor

ESSA

Event sequence symmetry analysis

FAERS

FDA Adverse Event Reporting System

FC

Fold change

FDR

False discovery rate

GC

Gastric cancer

GDC

Genomic Data Commons

GDSC

Genomics of Drug Sensitivity in Cancer

GEM

Genome-scale metabolic model

GEO

Gene Expression Omnibus

GPU

Graphics processing unit

GRAIL

Gene Relationships among Implicated Loci

GWAS

Genome-wide association studies

HMDB

Human Metabolome Database

HMP

Human Metabolome Project

HNSCC

Head and neck squamous cell carcinoma

HoC

Hallmarks of cancer

HPA

Human Protein Atlas

HTS

High-throughput screening

IC

Information components

IC50

Half maximal inhibitory concentration

ICGC

International Cancer Genome Consortium

IE

Information extraction

IntOGen

Integrative OncoGenomics

IR

Information retrieval

JMDC

Japan Medical Data Center

KD

Known drugs

KEGG

Kyoto Encyclopedia of Genes and Genomes

KS

Kolmogorov–Smirnov

LINCS

Library of Integrated Network-Based Cellular Signatures

MDS

Multidimensional scaling

MedDRA

Medical Dictionary for Regulatory Activities

MeSH

Medical Subject Headings

MGI

Mouse Genome Informatics

ML

Machine learning

MRS

Magnetic resonance spectroscopy

NCATS

National Center for Advancing Translational Sciences

NCBI

National Center for Biotechnology Information

NCI

National Cancer Institute

NER

Named-entity recognition

NGS

Next-generation sequencing

NHGRI

National Human Genome Research Institute

NIH

National Institutes of Health

NLP

Natural language processing

NMR

Nuclear magnetic resonance spectroscopy

PBMC

Peripheral blood mononuclear cell

PCA

Principal component analysis

PCR

Polymerase chain reaction

PDB

Protein Data Bank

PharmGKB

Pharmacogenomics Knowledge Base

PMID

PubMed ID

POS

Parts of speech

PPI

Protein–protein interaction

ReDIReCT

Repurposing of Drugs: Innovative Revision of Cancer Treatment

ReDO

Repurposing Drugs in Oncology

REMC

Roadmap Epigenomics Mapping Consortium

RGES

Reversal Gene Expression Scores

RMSD

Root mean square deviation

RMSE

Root mean square error

ROC

Receiver operating characteristic

ROR

Reporting odds ratios

RPPA

Reverse-phase protein microarrays

SD

Synthetic derivative

SEA

Similarity ensemble approach

SIDER

Side Effect Resource

SMILES

Simplified molecular-input line-entry system

SNP

Single-nucleotide polymorphism

sscMap

Statistically Significant Connections' Map

STITCH

Search Tool for Interactions of Chemicals

SVM

Support vector machine

T2DM

Type 2 diabetes mellitus

TCGA

The Cancer Genome Atlas

tINIT

Integrative Network Inference for Tissues

TM

Text mining

TMIC

The Metabolomics Innovation Centre

TNBC

Triple-negative breast cancer

TP

Target proteins

TPU

Tensor processing unit

TSDS

Topological Score of Drug Synergy

TTD

Therapeutic Target Database

UCSC

University of California, Santa Cruz

UMLS

Unified Medical Language System

VUMC

Vanderbilt University Medical Center

WES

Whole exome sequencing

Introduction

Cancer is a heterogeneous genetic disorder in which cell division signals are overactivated, eventually precipitating uncontrolled cell proliferation, invasion, and metastasis [ , ]. It is one of the leading causes of death globally. According to GLOBOCAN 2018 [ ], an estimated 18.1 million new cases and 9.6 million cancer-related deaths occurred worldwide.

Carcinogenesis results from the interaction between environmental factors and an individual's genetic makeup over time, which can bring about abnormal activation of proto-oncogenes or inhibition of tumor suppressor genes [ ]. The exact mechanisms of the genetic instabilities, at the chromosome and nucleotide levels, that account for tumor progression and heterogeneity remain unclear [ ]. Advanced bioinformatics and computational simulation techniques are therefore being used to unravel the molecular mechanisms behind these invasive oncogenic processes.

Current cancer therapy: are we ready to battle cancer?

Current chemotherapeutic approaches fail to target the stem cells from which cancer cells originate, and they focus on a limited number of genetic mutations that may not account for the extensive genetic variation linked with malignancies. Moreover, these therapies are imprecise: they act on normal somatic cells as though these had malignant potential, thereby imposing cytotoxicity on healthy tissue. Drug resistance adds to the problem. Factors likely to culminate in resistance include deficient activation of the enzymes that convert prodrugs to their active forms, aberrant drug transporters or efflux pumps that lower drug concentrations within cancer cells, irreparable DNA damage after a direct or indirect insult, and evasion of apoptosis. Although tumors often shrink after chemotherapy cycles are completed, in many cases these agents fail to eliminate cancer stem cells with metastatic potential, resulting in recurrence. This has prompted researchers to develop new drug regimens and combination therapies [ ].

De novo drug discovery versus drug repurposing

Novel drug discovery is a complex process that takes an average of 10 to 15 years to translate a new molecule into an approved drug. The approval process is tedious and suffers from high attrition rates, partly due to changing regulatory requirements. These rate-limiting steps in de novo drug development have necessitated a paradigm shift from conventional drug discovery pathways to drug repurposing research, which expedites the discovery of new indications for existing, banned, and investigational drugs ( Fig. 4.1 ).

Figure 4.1, Evolution of drug repurposing.

Drug repositioning bypasses many of the elaborate processes involved in conventional drug development and dramatically reduces the time required. It demands an investment of 1600 million USD, in contrast to the 12,000 million USD required for traditional drug discovery. Furthermore, failure rates are lower because the safety profiles of repurposable drugs are already established. The lower cost of drug repositioning research also helps low- and middle-income countries address their unmet medical needs [ ]. Recently repurposed drugs in cancer therapy are listed in Table 4.1 .

Table 4.1
List of recently repurposed drugs for cancer with their respective clinical trial status.
Repurposed drug | Original indication | New indication | Status
Ramucirumab [ ] | Advanced gastric or gastroesophageal junction adenocarcinoma | Hepatocellular carcinoma | Approved
Pembrolizumab [ ] | Metastatic melanoma | Metastatic small cell lung cancer | Approved
Artesunate [ ] | Malaria | Breast cancer | NCT00764036
Suramin [ ] | Sleeping sickness | Breast cancer | NCT00054028; NCT00003038
Thalidomide [ ] | Morning sickness | Esophageal cancer; advanced colorectal cancer | NCT01551641; NCT00890188
Papaverine [ ] | Smooth muscle relaxant | Non-small cell lung cancer; prostatic hyperplasia treatment and cancer prevention | NCT03824327; NCT03064282
Metformin [ ] | Diabetes mellitus | Prostate cancer; breast cancer | NCT03137186; NCT00984490; NCT00897884; NCT01302002
Lenvatinib mesylate [ ] | Thyroid cancer | Hepatocellular carcinoma; unresectable thyroid cancer; recurrent endometrial or ovarian cancer | NCT03663114; NCT02430714; NCT02788708
Quinacrine [ ] | Malaria and giardiasis | Non-small cell lung cancer; prostate cancer | NCT01839955; NCT00417274

Fundamental steps for a fruitful drug repurposing expedition

Drug repurposing requires a thorough understanding of polypharmacology, which enables exploration of the multitarget actions of a single drug and its involvement in other disease pathways. Up-to-date information on unexplored pathways involved in disease pathogenesis and progression, along with their associated biomarkers, is required before a repurposing effort begins. Repurposing investigations oriented toward genetic disorders also need supporting literature on the influence of environment and drugs on gene expression, transcription, translation, the epigenome, and metabolism. The data accumulated over the years are far too large to be handled manually. This has driven the adoption of knowledge-based, signature-based, target-based, and network-based computational approaches to untangle the hidden relationships among drugs, targets, and diseases [ ].

Incorporating novel informatics approaches, systems biology, and genomic information to reveal unknown targets or mechanisms of approved drugs improves repurposing methods and shortens timelines. Compounds identified by computational studies can then be validated through experimental testing. Hence, a combination of computational and experimental assays is desirable for repurposing drugs to new indications [ , ] ( Fig. 4.2 ).

Figure 4.2, Polypharmacology concept exploring off-target actions of drugs and its downstream signaling.

Drug repurposing draws on omics data from genomics [ ], transcriptomics [ ], proteomics [ ], epigenetics [ ], and metabolomics [ ]. Alongside omics databases, electronic health records and side effect data [ ] provide valuable hints for predicting novel indications of existing drugs [ ]. Further progress in this field has led to the development of mathematical algorithms and Machine Learning (ML) platforms for rapid and accurate prediction of repurposing candidates [ , ] ( Fig. 4.3 ).

Figure 4.3, Various in silico approaches involved in drug repurposing.

Global funding initiatives and evolving big data drug repurposing projects

With drug repurposing opening a new horizon in cancer therapy, a number of funding schemes have been initiated by both governmental and philanthropic agencies. The National Institutes of Health (NIH) established the National Center for Advancing Translational Sciences, which funds the development of novel therapeutic possibilities identified through initial in silico predictions [ ]. The Belgian Anticancer Fund, in collaboration with Global Cures of the USA, cofounded the Repurposing Drugs in Oncology [ ] project to screen and test the anticancer potential of existing noncancer drugs and redirect them to cancer therapy. Apart from these, Repurposing of Drugs: Innovative Revision of Cancer Treatment [ ], Clinical and Translational Science Awards [ ], Findacure [ ], Global Cures [ ], and Cures Within Reach [ ] are a few other renowned funding initiatives working to address the challenges encountered in Cancer Drug (CD) discovery.

Following the advent of these funding programs, big data projects such as The Cancer Genome Atlas (TCGA) [ ], International Cancer Genome Consortium (ICGC) [ ], NIH Library of Integrated Network-Based Cellular Signatures (LINCS) [ ], Cancer Genome Project (CGP) [ ], Clinical Proteomic Tumor Analysis Consortium [ ], Cancer Drug Resistance database [ ], and Oncomine [ ] were constructed. Databases relevant to cancer drug repurposing are listed with their web links in Table 4.2 . These databases are widely used to extract data and generate repurposing hypotheses through in silico methods.

Table 4.2
List of omics databases for cancer drug repurposing.
Serial number Database Link References
1 TCGA https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga [ ]
2 ICGC https://dcc.icgc.org/ [ ]
3 CCLE https://portals.broadinstitute.org/ccle [ ]
4 Ensembl https://asia.ensembl.org/index.html [ ]
5 dbGaP https://www.ncbi.nlm.nih.gov/gap/ [ ]
6 DisGeNET https://www.disgenet.org/ [ ]
7 dbSNP https://www.ncbi.nlm.nih.gov/snp/ [ ]
8 dbVar https://www.ncbi.nlm.nih.gov/dbvar/ [ ]
9 COSMIC https://cancer.sanger.ac.uk/cosmic [ ]
10 LINCS http://www.lincsproject.org/ [ ]
11 GTEx https://gtexportal.org/home/ [ ]
12 GEO https://www.ncbi.nlm.nih.gov/geo/ [ ]
13 ArrayExpress https://www.ebi.ac.uk/arrayexpress/ [ ]
14 Allen Brain Atlas https://portal.brain-map.org/ [ ]
15 Protein Data Bank (PDB) http://www.rcsb.org/ [ ]
16 Human Proteome Project https://hupo.org/human-proteome-project [ ]
17 SWISS-PROT https://www.iop.vast.ac.vn/theor/conferences/smp/1st/kaminuma/SWISSPROT/index.html [ ]
18 neXtProt https://www.nextprot.org/about/nextprot [ ]
19 GENCODE https://www.gencodegenes.org/ [ ]
20 ENCODE https://www.genome.gov/Funded-Programs-Projects/ENCODE-Project-ENCyclopedia-Of-DNA-Elements [ ]
21 CCDS https://www.ncbi.nlm.nih.gov/projects/CCDS/CcdsBrowse.cgi [ ]
22 Cancer Genome Interpreter https://www.cancergenomeinterpreter.org/home [ ]
23 Roadmap http://www.roadmapepigenomics.org/ [ ]
24 PsychENCODE https://www.nimhgenetics.org/resources/psychencode [ ]
25 BioGRID https://thebiogrid.org/ [ ]
26 KEGG https://www.genome.jp/kegg/pathway.html [ ]
27 Reactome https://reactome.org/ [ ]
28 SIDER http://sideeffects.embl.de/ [ ]
29 HPRD http://www.hprd.org/ [ ]
30 MINT https://mint.bio.uniroma2.it/ [ ]
31 GPS-Prot http://gpsprot.org/ [ ]
32 PINA https://omics.bjcancer.org/pina/ [ ]
33 MPIDB https://omictools.com/mpidb-tool [ ]
34 FAERS https://open.fda.gov/data/faers/ [ ]
35 IDAAPM https://omictools.com/idaapm-tool [ ]
36 VigiAccess http://www.vigiaccess.org/ [ ]
37 Genomics of Drug Sensitivity in Cancer (GDSC) http://www.cancerrxgene.org/ [ ]
38 Cancer Therapeutics Response Portal (CTRP) https://portals.broadinstitute.org/ctrp/ [ ]
39 STRING https://string-db.org/cgi/input.pl [ ]
40 cBioPortal https://www.cbioportal.org/ [ ]
41 UniProt https://www.uniprot.org/ [ ]
42 CGP https://icgc.org/icgc/cgp [ ]
43 CPTAC https://cptac-data-portal.georgetown.edu/cptacPublic/ [ ]
44 CancerDR http://crdd.osdd.net/raghava/cancerdr/ [ ]
45 Oncomine https://www.oncomine.org/resource/login.html [ ]
46 GWAS Catalog https://www.ebi.ac.uk/gwas/ [ ]
47 Tumorscape http://beroukhimlab.org/data-and-tools/ [ ]
48 UCSC Cancer Genomics Browser https://genome.ucsc.edu/ [ ]
49 IntOGen https://www.intogen.org/search [ ]
50 BioProfiling.de http://www.bioprofiling.de/ [ ]
51 CMap https://www.broadinstitute.org/connectivity-map-cmap [ ]
52 DrugBank https://www.drugbank.ca/ [ ]
53 TTD http://db.idrblab.net/ttd/ [ ]
54 CTD http://ctdbase.org/ [ ]
55 PubChem https://pubchem.ncbi.nlm.nih.gov/ [ ]
56 ChEMBL https://www.ebi.ac.uk/chembl/ [ ]
57 HMDB http://www.hmdb.ca/ [ ]
58 PharmGKB https://www.pharmgkb.org/ [ ]

The emergence of ML and Artificial Intelligence (AI) has eased drug repurposing predictions by allowing heterogeneous data to be integrated, although the “garbage in, garbage out” principle applies: predictions are only as reliable as the data fed into the models. Advances in computing hardware, such as graphics processing units and tensor processing units, have accompanied the generation of large, diverse datasets and a revolution in data storage. Many approaches use AI to predict the pharmacological effects of chemicals or drugs by integrating medical data. Artificial Neural Networks (ANNs), an outgrowth of AI, have been employed to analyze the mechanism of action of drug molecules under the CD screening program initiated by the National Cancer Institute (NCI) [ ].

This chapter describes the various repurposing approaches with examples of their practical application. It also highlights how these methods are being combined, and how recent advances in AI and ML are shaping the future of cancer drug repurposing.

Genomics: connecting genetics with drug repurposing in cancer

Genomics is a branch of the omics sciences composed of structural and functional genomics interlaced with elements of genetics [ ]. Structural genomics uses Next-Generation Sequencing (NGS), whole exome sequencing, and Single-Nucleotide Polymorphism (SNP) microarray genotyping to identify tumor-specific mutations, copy number alterations, gene expression changes, gene fusions, and germline variants. Functional genomics, on the other hand, compares the sequences of full-length complementary DNA with the respective genomic DNA to predict the corresponding transcriptomes and proteomes [ , ].

Owing to the labyrinthine genetic etiology of cancer, unearthing Differentially Expressed Genes (DEGs) using genomics is of immense assistance in identifying novel cancer targets for repositioning of drugs [ ]. In addition, this approach can be further employed to monitor the treatment efficacy and deduce mechanisms of resistance.

The establishment of global cancer genome mapping projects such as TCGA [ ] and ICGC [ ], together with Genome-Wide Association Studies (GWAS) [ ], has made information on genetic variation in cancer easily accessible.

In addition, user-friendly portals such as Tumorscape [ ], the University of California, Santa Cruz, Cancer Genomics Browser [ ], the ICGC Data Portal [ ], the Catalog Of Somatic Mutations In Cancer (COSMIC) [ ], cBioPortal [ ], Integrative OncoGenomics [ ], and BioProfiling.de [ ] help in retrieving and statistically analyzing oncogene data. Pooling genomic data also reduces heterogeneity bias, since the data are collated from a cohort of patients, and integrating several genomic sources provides a pool of targets for drugs to act on. Despite these advantages, cohort data lack specificity to an individual's genome. This limitation can be overcome by individualized genomic N-of-one studies, which are at the frontier of personalized medication management and drug repurposing [ ].

DeSigN tool: a web-based drug repurposing algorithm based on gene expression patterns

Lee et al. [ ] developed a web-based tool named “DeSigN” (Differentially Expressed Gene Signatures) to predict the phenotypic effects of drugs in cancer cell lines from their half maximal Inhibitory Concentration (IC50) values and gene expression patterns. The algorithm was constructed in three steps. First, a reference database was created from the drug sensitivity patterns of cell lines in Genomics of Drug Sensitivity in Cancer (GDSC) [ ]; it contained 140 drugs, each with a unique rank order–based gene signature. Second, query inputs for the DeSigN database were generated from DEGs between tumor and control samples, derived from microarray or RNA-Seq gene expression data. Third, a nonparametric, rank order–based pattern-matching algorithm built on modified Kolmogorov–Smirnov (KS) statistics, as implemented in the Connectivity Map (CMap) [ ], was used to correlate the query signatures with drug-associated gene expression profiles. Drug candidates were then prioritized by the connectivity score computed from the modified KS test.

To demonstrate the validity of the tool in predicting candidate drugs, four datasets (two estrogen receptor positive breast cancer, one non-small cell lung cancer, one pancreatic cancer) from the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) [ ] were selected, and a DEG list was prepared from each for use as a DeSigN query. The tool was then experimentally validated in Oral Squamous Cell Carcinoma (OSCC) cell lines to identify growth inhibitors. The resulting gene signature, containing 69 upregulated and 86 downregulated genes, was used as a query in DeSigN, which returned nine potential candidates: GSK-650394, pyrimethamine, RDEA 119, BIBW2992, CGP-082996, lapatinib, PF-562271, bosutinib, and PD-0325901. Among these nine candidates, BIBW2992 and bosutinib had recently been reported to be effective in head and neck squamous cell carcinoma cell lines. The in silico prediction for bosutinib was experimentally validated in the ORL-196, ORL-204, and ORL-48 OSCC cell lines, in which the drug exhibited significant cytotoxicity at a concentration of 1 μM [ ].
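The rank-based matching in the third step can be illustrated with a minimal sketch. It assumes the KS-style enrichment statistic and connectivity score of the original Connectivity Map formulation, not DeSigN's specific modification, which is not detailed here. The function names and the toy gene identifiers are hypothetical.

```python
from typing import Iterable, Sequence


def ks_enrichment(ranked_genes: Sequence[str], tags: Iterable[str]) -> float:
    """Signed KS-style enrichment of a query tag set within a ranked reference list.

    Assumption: this follows the original CMap formulation; DeSigN's
    modified statistic may differ in detail.
    """
    position = {gene: i + 1 for i, gene in enumerate(ranked_genes)}  # 1-based ranks
    ranks = sorted(position[g] for g in set(tags) if g in position)
    t, n = len(ranks), len(ranked_genes)
    if t == 0:
        return 0.0  # no query gene is present in the reference list
    # Largest deviations between the tag distribution and a uniform one
    a = max((j + 1) / t - v / n for j, v in enumerate(ranks))
    b = max(v / n - j / t for j, v in enumerate(ranks))
    return a if a > b else -b


def connectivity_score(ranked_genes: Sequence[str],
                       up_tags: Iterable[str],
                       down_tags: Iterable[str]) -> float:
    """Combine up- and down-tag enrichment; zero when both point the same way."""
    ks_up = ks_enrichment(ranked_genes, up_tags)
    ks_down = ks_enrichment(ranked_genes, down_tags)
    if ks_up * ks_down > 0:
        return 0.0
    return ks_up - ks_down


if __name__ == "__main__":
    # Hypothetical reference signature: g1 ranked highest, g10 lowest
    reference = [f"g{i}" for i in range(1, 11)]
    print(connectivity_score(reference, ["g1", "g2"], ["g9", "g10"]))
    print(connectivity_score(reference, ["g9", "g10"], ["g1", "g2"]))
```

A score near the positive extreme means the query's upregulated genes sit at the top of the reference ranking and its downregulated genes at the bottom. A score near the negative extreme means the reverse. How each sign is interpreted depends on how the reference signatures were ranked.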

Anticancer drug repositioning through genome-wide association studies

Zhang et al. [ ] designed an in silico pipeline that mapped 50 SNPs located in rectal mucosal cells, obtained from the National Human Genome Research Institute GWAS catalog [ ], to 140 genes associated with Colorectal Cancer (CRC) using the snp2gene algorithm. The mapped genes were prioritized based on (i) functional annotation using the Database for Annotation, Visualization, and Integrated Discovery [ ] bioinformatics resources, (ii) cis-expression quantitative trait loci effects using peripheral blood mononuclear cell data generated from 5311 European subjects, (iii) PubMed text mining (TM) via the Gene Relationships among Implicated Loci tool [ ], (iv) Protein–Protein Interaction (PPI) analyzed through the Disease Association Protein–Protein Link Evaluator [ ], (v) genetic overlap with cancer somatic mutations in the COSMIC database [ ], (vi) genes mapped to knockout mouse phenotypes from the Mouse Genome Informatics [ ] database, and (vii) SNPs in linkage disequilibrium (r² > 0.80) that were annotated as missense variants using the NIH Roadmap Epigenomics Mapping Consortium [ ]. Thereafter, genes were scored using the Pearson correlation method, with scores ranging from 0 to 7; the 35 genes that scored ≥ 2 were considered biological risk genes. These top-priority genes were used as query signature inputs to predict repurposable drugs from DrugBank [ ] and the Therapeutic Target Database [ ]. The study revealed the anticancer potential of crizotinib, arsenic trioxide, vorinostat, dasatinib, estramustine, and tamibarotene against CRC.
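One simple reading of a 0-to-7 gene score is a count of how many of the seven criteria a gene satisfies. The sketch below assumes that reading for illustration only. It is not confirmed as the authors' scoring procedure, which the text describes as using the Pearson correlation method. The criterion keys, gene names, and evidence flags are all hypothetical.

```python
from typing import Dict, List

# Hypothetical keys standing in for the seven prioritization criteria (i)-(vii)
CRITERIA = [
    "functional_annotation",
    "cis_eqtl",
    "text_mining",
    "ppi",
    "somatic_mutation_overlap",
    "knockout_mouse_phenotype",
    "missense_variant_in_ld",
]


def score_gene(evidence: Dict[str, bool]) -> int:
    """Count how many criteria a gene satisfies (assumed scoring rule, range 0-7)."""
    return sum(1 for criterion in CRITERIA if evidence.get(criterion, False))


def prioritize(genes: Dict[str, Dict[str, bool]], threshold: int = 2) -> List[str]:
    """Return genes scoring at or above the threshold, highest score first."""
    scores = {gene: score_gene(evidence) for gene, evidence in genes.items()}
    kept = [gene for gene, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda gene: (-scores[gene], gene))


if __name__ == "__main__":
    # Made-up evidence flags, for illustration only
    example = {
        "GENE_A": {"cis_eqtl": True, "ppi": True, "text_mining": True},
        "GENE_B": {"ppi": True},
        "GENE_C": {"functional_annotation": True, "somatic_mutation_overlap": True},
    }
    print(prioritize(example))
```

The default threshold of 2 mirrors the cutoff reported for biological risk genes. The prioritized list would then be used to query drug-target resources.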

Proteomics: proteins to pave way for cancer drug repurposing

Proteomics is a branch of biology that encompasses a cluster of technologies, such as activity-based protein profiling, reverse-phase protein microarrays, and magnetic resonance spectroscopy, to investigate and characterize the total protein content of a cell, tissue, or organism. It also covers the analysis of PPIs, protein–nucleic acid interactions, and the posttranslational modifications that influence protein function. It provides comprehensive information for comparing normal and disease states, on transcription and expression, and on the side effects of drugs, and it aids biomarker identification. In addition, proteomics relies on the concept of polypharmacology, which describes the role of a single protein in multiple pathways that may trigger multiple signaling mechanisms. These intricacies underpin the relevance of target-based proteomic approaches in novel drug discovery and repurposing [ ].

Current oncotherapy strategies are oriented toward inhibiting DNA synthesis and abnormal signaling mechanisms, but their effect is undermined by drug resistance. This underscores the value of high-throughput proteomic screening methods, which not only identify the proteins involved in cancer-associated cellular network cascades but also illuminate the underlying molecular mechanisms of pathogenesis and disease progression [ ]. Designing these screening methods requires structural information on both target and drug. Accordingly, study of the whole-organism proteome, recognition of potential druggable binding sites, prediction of PPIs, and identification of interacting residues in the binding site are crucial for discerning drug candidates that can produce the desired pharmacological effects. Numerous target-based approaches, such as molecular docking, molecular dynamics, and pharmacophore modeling, have been developed to explore potential druggable candidates.
