Biomarkers of diseases in medicine
MANOJ KUMAR¹ and SHIV K SARIN¹,²
¹ Department of Hepatology, G B Pant Hospital and Institute of Liver and Biliary Sciences, New Delhi 110 002, India.
² Department of Gastroenterology, G B Pant Hospital, New Delhi 110 002, India.
e-mail: shivsarin@gmail.com

Biomarkers have gained immense scientific and clinical value and interest in the practice of medicine. Biomarkers are potentially useful along the whole spectrum of the disease process. Before diagnosis, markers could be used for screening and risk assessment. During diagnosis, markers can determine staging, grading, and selection of initial therapy. During treatment, they can be used to monitor therapy, select additional therapy, or monitor recurrent diseases. Advances in genomics, proteomics and molecular pathology have generated many candidate biomarkers with potential clinical value. In the future, integration of biomarkers, identified using emerging high-throughput technologies, into medical practice will be necessary to achieve 'personalization' of treatment and disease prevention.

Keywords. Biomarkers; diagnosis; screening; prognosis.

1. Introduction

In 2001, a consensus panel at the National Institutes of Health defined the term biomarker as 'a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention or other health care intervention'. The biomarker is either produced by the diseased organ (e.g., tumour) or by the body in response to disease. Biomarkers are potentially useful along the whole spectrum of the disease process. Before diagnosis, markers could be used for screening and risk assessment. During diagnosis, markers can determine staging, grading, and selection of initial therapy. Later, they can be used to monitor therapy, select additional therapy, or monitor recurrent diseases [1]. Thus, biomarkers include all diagnostic tests, imaging technologies, and any other objective measures of a person's health status. Biomarkers can also be used to reduce the time and cost of phase I and II clinical trials by replacing clinical endpoints.

Biomarkers span a broad sector of human health care and have been around since the understanding of human biology and diseases began to evolve. So, why is so much attention being paid to biomarkers today? Genetics, genomics, proteomics, modern imaging techniques and other high-throughput technologies allow us to measure more markers than before. In addition, we have achieved a greater understanding of disease pathways, the targets of interventions, and the pharmacologic consequences of medicines.

2. Phases of evaluation of biomarkers

Because of diseased tissue/tumour heterogeneity and other biases that might be inherent in biomarker identification and evaluation processes, it is important that the identification of biomarkers should proceed in a systematic manner. Unlike a clinical trial design in which there are three phases (phase I, phase II and phase III), research on biomarkers has largely been guided by intuition and experience. In 2002, the National Cancer Institute's 'Early Detection Research Network' developed a five-phase approach to systematic discovery and evaluation of biomarkers. In general, biomarker development should follow an orderly process wherein one proceeds to the next phase only after meeting pre-specified criteria for the current phase [2].

Phase I refers to preclinical exploratory studies. Biomarkers are discovered through knowledge-based gene selection, gene expression profiling or protein profiling to distinguish cancer and normal samples. Identified markers are prioritized based on their diagnostic/prognostic/therapeutic (predictive) value that could suggest their evolution into routine clinical use. The analysis of this phase is usually characterized by ranking and selection, or finding suitable ways to combine biomarkers. Although not required, it is preferred that the specimens for this phase of discovery come from well-characterized cohorts, tissue banks or from a trial with active follow-up.

Phase II has two important components. Upon successful completion of phase I requirements, an assay is established with a clear intended clinical use. The clinical assay could be a protein-, RNA-, DNA- or a cell-based technique, including ELISA, protein profiles from MS, phenotypic expression profiles, gene arrays, antibody arrays or quantitative PCR. To document clinical usefulness, firstly, such assays need to be validated for reproducibility and shown to be portable among different laboratories. Secondly, the assays should be evaluated for their clinical performance in terms of 'sensitivity' and 'specificity' with thresholds determined by the intended clinical use.

During Phase III, an investigator evaluates the sensitivity and specificity of the test for the detection of diseases that have yet to be detected clinically. The specimens analyzed in this evaluation phase are taken from study subjects before the onset of clinical symptoms, with active follow-up to ascertain disease occurrence. It is usually time-consuming and expensive to collect these samples with high quality; therefore, phase III should consist of large cohort studies or intervention trials whenever possible. This is probably when most biomarker validation studies will end and the biomarker will be ready for clinical use.

Phase IV evaluates the sensitivity and specificity of the test on a prospective cohort. The major difference from phase III is that in phase IV a positive test triggers a definitive diagnostic procedure, which is often invasive and could lead to an increased economic healthcare burden. Therefore, in a phase IV study, an investigator can estimate the false referral rate based on tested biomarkers and describe the extent and characteristics of the disease detected (e.g., the stage of tumour at the time of detection). For rare diseases, phase IV requires a large cohort with long-term follow-up and might often be too expensive as a stand-alone activity. These studies are especially difficult to perform for rare diseases.

Phase V evaluates the overall benefits and risks of the new diagnostic test on the screened population. The cost per life saved is one example of an endpoint for such a study. This again requires a large-scale study over a long time period and could also be prohibitively expensive.

Phases IV and V are necessary to evaluate benefits and risks of the use of a biomarker in screening and detection.

3. Characteristics of an ideal biomarker and basic statistical methods for evaluation

• An ideal biomarker should be safe and easy to measure.
• The cost of follow-up tests should be relatively low, and there should be a proven treatment to modify the biomarker.
• It should be consistent across genders and ethnic groups.

If the biomarker is to be used as a diagnostic test, it should be sensitive and specific and have a high predictive value (table 1). A highly sensitive test will be positive in nearly all patients with the disease, but it may also be positive in many patients without the disease. To be of clinical value, a test with high sensitivity should also have high specificity; in other words, most patients without the disease should have negative test results. For predicting the likelihood of disease on the basis of the test result, rather than the converse, the appropriate measures are positive and negative predictive values. Unfortunately, the positive predictive value falls as the prevalence of the disease falls, so tests for rare conditions will have many more false positive results than true positive results.

Table 1. Performance characteristics of biomarkers.

                      Disease present   Disease absent
Biomarker positive    A                 B
Biomarker negative    C                 D

Disease prevalence: (A + C)/(A + B + C + D); Negative likelihood ratio (LR−): (1 − sensitivity)/specificity; Negative predictive value: D/(C + D); Positive likelihood ratio (LR+): sensitivity/(1 − specificity); Positive predictive value: A/(A + B); Specificity: D/(B + D); Sensitivity: A/(A + C).

The diagnostic odds ratio (DOR) of a biomarker represents the comprehensive ability of the marker according to the following formula:

$$\mathrm{DOR} = \frac{\text{sensitivity}/(1 - \text{specificity})}{(1 - \text{sensitivity})/\text{specificity}}$$
Information about the diagnostic test itself can be summarized using a measure called the likelihood ratio. The likelihood ratio combines information about the sensitivity and specificity. It tells how much a positive or negative result changes the likelihood that a patient would have the disease. The likelihood ratio of a positive test result (LR+) is sensitivity divided by 1 − specificity:

$$\mathrm{LR}{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}$$

The likelihood ratio of a negative test result (LR−) is 1 − sensitivity divided by specificity:

$$\mathrm{LR}{-} = \frac{1 - \text{sensitivity}}{\text{specificity}}$$

The likelihood ratio for a positive result (LR+) tells how much the odds of the disease increase when a test is positive. The likelihood ratio for a negative result (LR−) tells how much the odds of the disease decrease when a test is negative. The likelihood ratio can be combined with information about the prevalence of the disease, characteristics of the patient pool, and information about a particular patient to determine the post-test odds of disease. To quantify the effect of a diagnostic test, information about the patient is needed first. The pre-test odds, that is, the odds that the patient has a specific disease prior to testing, should be specified. The pre-test odds are usually related to the prevalence of the disease, though they might be adjusted upwards or downwards depending on characteristics of the overall patient pool or of the individual patient. Once pre-test odds have been specified, they are multiplied by the likelihood ratio to give the post-test odds:

$$\text{odds}_{\text{post}} = \text{odds}_{\text{pre}} \times \text{likelihood ratio}$$

The post-test odds represent the chances that a particular patient has a disease. They incorporate information about the prevalence of the disease, the patient pool, and specific patient risk factors (pre-test odds) and information about the diagnostic test itself (the likelihood ratio).

Most biological markers, however, are not simply present or absent but have wide ranges of values that overlap in persons with a disease and in those without it. The risk typically increases progressively with increasing levels; few markers have a threshold at which the risk suddenly rises, so various cut-off points must be evaluated for their ability to detect disease. Cut-off points with high sensitivity, producing few false negative results, are used when the consequences of missing a potential case are severe, whereas highly specific cut-off points, producing few false positive results, are used to avoid mislabelling a person who is actually free of the disease. Sensitivity and specificity calculated at various cut-off points generate a receiver-operating-characteristic (ROC) curve, which ideally will be highly sensitive throughout the range of specificity. The most useful clinical tests are typically those with the largest area under the ROC curve.

The use of multiple tests may also be considered for screening. When multiple tests are obtained in series and the disease is considered present when all tests are positive ('AND rule'), specificity is enhanced whereas sensitivity is diminished. When multiple tests are obtained in parallel and the disease is considered to be present when any of the tests is positive ('OR rule'), sensitivity is enhanced and specificity diminishes [3].

Even if a biomarker meets several criteria that make it 'ideal', this does not imply that the biomarker will necessarily be useful in a clinical setting. Specifically, if a novel biomarker cannot add value to the tests and biomarkers that are already being used in clinical settings, then it may never pass the sizeable hurdle that separates clinical research from clinical practice.

4. Specific ways to test if a biomarker adds to current risk assessment

4.1 Model discrimination

The C-statistic, or area under the receiver operating characteristic curve (AUC), is a popular method to test model discrimination. The C-statistic for a multivariable model reflects the probability of concordance among persons who can be compared for a given outcome of interest and represents the probability that a case has a higher measure or risk score (or a shorter time to event in survival analyses) than a comparable control. The C-statistic measures the concordance of the score and disease state. The value of the C-statistic ranges from 0.5 (no discrimination) to 1.0 (perfect discrimination) and for the Framingham CHD risk score, the C-statistic is approximately 0.76 [4]. Similarly, table 2 shows AUC for various markers for HCC [5].

Table 2. AUC for various markers for HCC diagnosis.

Test      AUC      SE (AUC)
AFP       0.647    0.027
DCP       0.688    0.083
AFP-L3    0.695    0.166

Abbreviations: AFP, alphafetoprotein; AFP-L3, Lens culinaris agglutinin-reactive fraction of AFP; AUC, area under the curve; DCP, des-gamma-carboxyprothrombin; HCC, hepatocellular carcinoma; SE, standard error.

When considering the efficacy of novel biomarkers in risk stratification, one approach is to determine to what extent entering the candidate biomarker into standard risk prediction models will actually increase the model's C-statistic. For instance, a recent investigation in the Atherosclerosis Risk in Communities Study demonstrated the extent to which several individual biomarkers increased the C-statistic for CHD prediction above and beyond age, race, sex, total cholesterol level, high-density lipoprotein cholesterol level, systolic blood pressure, antihypertensive medication use, smoking status and diabetes. This study concluded that out of the panel of 19 novel biomarkers studied, lipoprotein-associated phospholipase A2, vitamin B6, IL-6 and soluble thrombomodulin added the most to the C-statistic but each only increased it marginally (C-statistic increment range 0.006–0.011) [6].

There are several limitations to using increments in the C-statistic to determine the utility of biomarkers in risk prediction [7]. First, the C-statistic depends, to a large extent, on the magnitude of the association (or odds ratio) between a dichotomous exposure and outcome. Other limitations of the C-statistic include low sensitivity for determining the relative importance of different risk factors in a multivariable model.

4.2 Model calibration

A complementary step when analyzing the efficacy of a biomarker is to assess the degree to which the biomarker improves model calibration. This can be thought of as the extent to which the expected risk (estimated by statistical models) agrees with the observed (or true) risk. This concept may be important when counselling patients with regard to their numeric risk or probability of developing a given condition. One statistical test that can be employed to compare these probabilities is the Hosmer–Lemeshow calibration statistic. In the Women's Health Study, investigators found that deciles of predicted and observed 10-year CVD risk for a multivariable model with Framingham risk score covariates plus C-reactive protein had a lower p-value than a multivariable model not containing C-reactive protein (p = 0.039 vs. p = 0.23) [8]. A simple statistical test to compare model discrimination with and without the biomarker of interest would fail to provide valuable information regarding which specific groups (i.e., which deciles or quintiles and so on) of observed and expected risk are better explained by including a biomarker of interest.

4.3 Risk reclassification

The utility of a biomarker may also be assessed by studying how biomarker information may lead to a reclassification of individuals in low-, medium- and high-risk categories based on traditional risk factors. The ultimate goal of this approach is to refine risk stratification, and it has been particularly emphasized when considering biomarker information that would serve to shift individuals who are in the intermediate-risk groups (i.e., based on the Framingham risk score) upwards into the high-risk category or downwards into the low-risk category. Recent guidelines have recommended that the individuals in the intermediate-risk category be targeted to undergo screening for existing sub-clinical atherosclerosis [9].

4.4 Model validation

Validation, generalizability and transportability of risk scores are significant characteristics of robust risk prediction models and have important implications regarding the widespread utility of biomarkers. It is ideal if the formulation of a risk score uses separate derivation and validation samples. In the absence of an independent validation sample, the degree of over-optimism in the models could be judged by using bootstrap estimations. Also, risk models that perform well in one population should be validated in other study samples. For instance, the Framingham risk score was originally developed in the Framingham heart study population, which largely consists of individuals of white European ancestry. An investigation determined that among Japanese–American and Hispanic men and Native American women, the Framingham functions systematically over-estimated the risk of 5-year CHD events and thus needed to be recalibrated (to risk factor and CHD levels within those populations) in order to maintain good performance [10].

4.5 Considering the use of multiple biomarkers

Although the ultimate aim of biomarker investigations is to develop a parsimonious set of biomarkers that will most accurately predict disease outcome, the reality is that several candidate biomarkers in a multitude of separate studies have already undergone evaluation. It is therefore difficult to extrapolate the findings of these diverse studies into one unifying conclusion about which of the several potential biomarkers add substantially to risk prediction so as to be considered for measurement in routine practice. Figure 1 shows the use of single and multiple markers for HCC diagnosis [5].
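The C-statistic discussed in section 4.1 can be computed without any modelling library by comparing every case with every control and counting how often the case has the higher score. The following is a minimal sketch in Python under stated assumptions: the function name `c_statistic` and both sets of risk scores are invented for illustration (they are not data from the article or from the cited studies), and ties are counted as half-concordant, which is a common convention rather than something the article specifies.

```python
from itertools import product
from typing import Sequence


def c_statistic(scores: Sequence[float], has_disease: Sequence[bool]) -> float:
    """Probability that a randomly chosen case outscores a randomly chosen control.

    Tied pairs contribute 0.5 (an assumed convention).
    """
    cases = [s for s, y in zip(scores, has_disease) if y]
    controls = [s for s, y in zip(scores, has_disease) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    concordant = 0.0
    for case, control in product(cases, controls):
        if case > control:
            concordant += 1.0
        elif case == control:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical cohort: three cases followed by four controls.
status = [True, True, True, False, False, False, False]

# Invented risk scores from a baseline model and from the same model
# after a candidate biomarker has been added.
baseline_scores = [0.8, 0.4, 0.6, 0.5, 0.3, 0.2, 0.4]
with_marker_scores = [0.8, 0.45, 0.6, 0.5, 0.3, 0.2, 0.4]

c_base = c_statistic(baseline_scores, status)
c_new = c_statistic(with_marker_scores, status)
print(f"C-statistic without marker: {c_base:.3f}")
print(f"C-statistic with marker:    {c_new:.3f}")
print(f"increment:                  {c_new - c_base:.3f}")
```

The pairwise loop is quadratic in cohort size, which is acceptable for a sketch; for large cohorts a rank-based calculation gives the same value more efficiently.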
BIOMARKERS OF DISEASES IN MEDICINE 407 Table 3. High-throughput technologies. 1) Genomics - Genome sequencing - Genome variation - Genome annotation 2) Transcriptomics - Microarrays - Gene expression data 3) Proteomics - Y2H method - Mass spectrometry - Protein chips 4) Metabolomics - NMR - Mass spectrometry Figure 1. SROC curves for 3 diagnostic tests for HCC. Abbreviations: AFP, alphafetoprotein; AFP-L3, Lens culi- naris agglutinin-reactive fraction of AFP; DCP, des-gam- macarboxyprothrombin; HCC, hepatocellular carcinoma; SROC, summary receiver operating characteristics. 5. Biomarker discovery using high-throughput technology platforms 5.1 High-throughput technologies – basic premises Historically, some screening tools (e.g., pap smears and colonoscopy) have successfully reduced mor- tality through early detection. Despite these suc- Figure 2. Schematic representation of the uses of bio- markers across the spectrum of diseases. Before diagnosis, cesses, the field of early detection has been plagued markers might be used for risk assessment and screening. by problems of over diagnosis (e.g. PSA), inade- At diagnosis, markers can assist with staging, grading, and quate specificity of individual markers (e.g. CA125, selection of initial therapy. Later, they can be used to mon- CEA and AFP), low compliance (colonoscopy) itor therapy, select additional therapy, or monitor for recur- and a lack of analytical tools for discovering new rent disease. diagnostic markers. The limited number of useful markers has propelled investigators to use high- throughput platforms to identify large numbers of such as the ABI 3700 automate and multiplex candidate biomarkers. the Sanger method so that it can be utilized to High-throughput technologies are useful to sequence whole genomes. 
The ABI 3700 has the assess genomic data (which define the messages capacity to run 12 runs a day with 96 samples and the resulting protein sequences), transcrip- of ∼ 500 nucleotides long amplified DNA frag- tomic data (which reveal the levels of messages ments in parallel. This results in a nominal sequen- present), proteomic data (which give the lev- cing capacity of 576 kb a day (the whole human els of each protein present), and ‘fluxomic’ data genome is of the order of 3 billion bases). The (which, if it existed, would provide measure- reported accuracy of the ABI 3700 is 98.5% indica- ments of intracellular fluxes on a complete scale) ting that there are less than 2 errors per 100 bases table 3. sequenced [11]. Once the genome is sequenced, the next important task is the study of genetic 5.2 Genomics or genome variation between individuals. The types of variations that are commonly considered Genomics defines the genetic messages and the include single nucleotide polymorphisms (SNPs) resulting protein sequences. Modern sequencers and different types of repeats. SNPs are defined
408 MANOJ KUMAR AND SHIV K SARIN as single base variations between individuals that Table 4. Methodologies in proteomics. occur at high enough frequency in a population to 1. Protein interaction mapping be considered to be non-random. The reason for – Methods include yeast two-hybrid, co-immunopreci- interest in genomic variation is that these varia- pitation with mass spectrometry, and protein chips tions are a large part of what determines the differ- 2. Protein expression profiling ence between individuals especially when it comes – Same as gene expression profiling, but for proteins to susceptibility to various diseases and responses – Methods include 2DGE or LC coupled with mass spec- to drug treatments. The latter aspect is known trometry and protein chips as pharmacogenomics [12]. 3. Protein activity profiling – Usually done using protein chips 4. Protein modification profiling 5.3 Transcriptomics – For example, phosphorylation – Usually done using some mass spectrometry-based After genomics, transcriptomics is probably the approach best developed of the different high-throughput technologies. Transcriptomics could be defined as the study of the expressed mRNA transcript complement of a cell under different conditions. 5.4 Proteomics The central quantity in transcriptomics is the gene (or mRNA) expression profile of the cell. While Proteomics could be described as a large-scale mRNAs do not play as important a role in cellu- study of protein structure, expression, and function lar function as proteins, there are a number of rea- (including modifications and interactions). Some sons why one might prefer doing mRNA expression of the proteomic tasks and the methods used are profiling as opposed to protein expression profil- given in table 4. ing. The principal reason is quite practical though – nucleic acids (such as mRNA) are much easier to 5.5 Metabolomics separate, purify, detect and quantify than proteins. 
Also since protein concentrations can be considered In addition to genomics, trancriptomics, and pro- to be integrals of mRNA concentrations, the vari- teomics data, the changes in metabolite concentra- ability at the mRNA level is usually larger than tion levels in the cell can be used for analysis of the variability at the protein level. A third reason phenotypic behavior in the cell. Unlike genes that is simply that mRNA and protein expression mea- are encoded by 4 letters, or proteins that are made surements complement each other. from 20 amino acids, metabolites don’t have a set The major attraction in transcriptomics is that of codons and thus cannot be sequenced. Instead, the ability to measure mRNA concentrations of all they are characterized by their elemental compo- genes under any condition allows studying regu- sition, order of atoms, stereochemical orientation, lation of gene expression at a genome-wide scale. and molecular charge. The basic idea in transcription profiling is to mea- ‘Target analysis’ is the process of perturbing sure (usually relative) mRNA expression levels of one gene and measuring the effect of this pertur- thousands of genes simultaneously in a cell or bation on the concentration of a target metabo- tissue sample under specific conditions. All tran- lite (i.e., the metabolite of interest). If more than scription profiling techniques are based on the one gene is perturbed and the changes of a tar- process of hybridization, in which a cDNA tar- get metabolite is measured following such pertur- get from the sample to be studied is hybridized to bations, the analysis is referred to as ‘metabolite its complementary single stranded DNA probe on profiling’. ‘Metabolomics’ is a whole-cell measure- an array. 
The target cDNA is created by extract- ment of all the metabolites and it is considered to ing all mRNA from a sample, reverse transcribing be equivalent to transcriptomics in mRNA expres- the mRNAs to cDNAs, and simultaneously label- sion analysis. Metabolite concentration levels can ing the resulting cDNAs with a dye so that they also be measured in a high-throughput and quali- can be detected and quantified. The two stan- tative fashion. This is referred to as ‘metabolic fin- dard technologies for transcription profiling are ger printing’. Primary tools for such an analysis cDNA microarrays (where the DNA probe on the include NMR and mass spectrometry. array is a long cDNA), and Affymetrix Gene Chips The reason for using high-throughput (where the probe on the array is a short oligonu- technologies is that they provide a large number of cleotide). In addition to these major techniques correlative data on gene or protein expression in there are a number of more sensitive and flexible relation to disease. Such data are then analyzed for technologies that have been developed in recent their association to the disease. The assumption years. is that multiple variables will be able to provide
BIOMARKERS OF DISEASES IN MEDICINE 409 information on associations more accurately than a the major problems with high-dimensional data single variable (marker). Such strong associations derived from high-throughput genomic and pro- provide major impetus for the molecular profiling teomic technologies is overfitting of the data when approaches to find patterns or profiles for a clini- there are large numbers of potential predictors cal test based on high dimensional gene or protein among a small number of outcome events. For expression panels [13]. example, a recent study of RNA micoarray analy- Comparative genomic analyses have yielded a sis showed how easy it was to overfit data with large number of genomic expression data in rela- a small number of samples. Simon and colleagues tion to disease. The patterns of gene expressions clearly demonstrated that expression data on 6000 that are observed represent novel signatures for the genes from imaginary individuals, 10 normal and respective diseases and can be used to both develop 10 cases, could be used to discover discrimina- new clinical tests based upon gene expression pat- tory patterns, using one common method, with terns, and identify candidate markers for diagno- 98% accuracy [16]. Many of the so-called ‘omics’ sis and prognosis. For example, high-throughput derived data are subjected to a similar over-fitting platforms have been developed to screen genome- if the training and validation sets for analyses are wide methylation and single nucleotide polymor- small and not randomized. Most commonly used phism patterns (haplotypes) in tumour tissues approaches to analyze ‘omics’ data are artificial and body fluids. Aberrant DNA methylation of neural networks, boosted decision tree analyses, CpG dinucleotides is a common epigenetic alter- various types of genetic algorithms and sup- ation that contributes to colon cancer formation port vector machine-learning algorithms. Each [14]. 
Aberrant CpG island methylation results in approach has the potential to over fit the data. transcriptional silencing of genes and is a mecha- Over fitting has led to strong conclusions that nism for inactivating tumour suppressor genes in are likely to be erroneous. The first step, there- colon cancer. The methylated tumour DNA can fore, would be to determine whether the results be detected using methylation-specific PCR (MSP) are reproducible and portable. For this purpose, and thus has the potential to be used as a molecular information on samples should be blinded and marker for cancer. In colon cancer, the tumour samples be sent to several laboratories for running suppressor genes CDKN2A, MGMT and MLH1, the sample sets under a fixed protocol. The data as well as other genes (e.g., TIMP-3, p14ARF, from each laboratory should be analyzed by an APC, MINT31, MINT2 and THBS-1), are com- independent data manager to learn if each labo- monly methylated and are thus candidate mole- ratory reproduced a similar result. Splitting the cular markers for colon cancer. The methylation of samples randomly between ‘training sets and val- CDKN2A, MGMT, MLH1, MINT31, MINT2 and idation sets’ should minimize the over fitting. The other genes occurs early in the adenoma-carcinoma validation set should not contain samples used in sequence suggesting that these alterations could be training sets [17]. used for the early detection of colon cancer. Single nucleotide polymorphisms have also been used as genetic markers of risk, treatment response, 6. Types of biomarkers discovered using and gene and environment interactions in both rare high-throughput technologies and common cancers. For example, SNPs within BRCA genes, as well as in the surrounding regions, 6.1 DNA biomarkers are associated with breast and ovarian cancer risk. 
The HLA haplotypes have been found to correlate Increased serum DNA concentrations are asso- with the outcome of cytokine therapy for renal cell ciated with various types of cancers and with carcinoma. SNPs might also be useful for predict- other diseases such as sepsis and autoimmune dis- ing outcome of ‘chemoprevention’ (i.e. the use of ease. Mutations in oncogenes, tumour-suppressor one or several natural or synthetic substances to genes, and mismatch-repair genes can serve as reduce the risk of developing cancer, or to reduce DNA biomarkers. For instance, mutations in the the chance of cancer recurrence) [15]. oncogene KRAS predict metastatic spread in Similarly, comparative analysis of serum and various tumour types, and there are mutations plasma samples by MS-based techniques, such in the gene that encode the tumour suppres- as surface enhanced laser desorption ioniza- sor p53 in more than half of sporadic cancers. tion (SELDI)–MS has shown patterns of pro- Germline inheritance of a TP53 mutation (Li– tein/peptide features indicative of a range of Fraumeni syndrome) confers a risk of developing diseases, particularly cancer. many of the same cancers. Mutations in other These high-throughput technologies have signi- cancer-related genes, such as the RAS oncogene ficantly increased the number of potential DNA, or the tumour-suppressor genes CDKN2A (cyclin- RNA and protein biomarkers under study. One of dependent kinase inhibitor A, which encodes
p16INK4A), APC (the adenomatous polyposis coli gene) and RB1 (the retinoblastoma gene), also have potential as markers for prognosis or selection of therapy (see below) [18].

Epigenetic regulation of transcription and translation can also be important in carcinogenesis. Histone deacetylation, lysine-specific histone-H3 methylation, and promoter region CpG methylation can function through transcriptional abrogation of tumour-suppressor genes (e.g., APC or the breast cancer 1 gene, BRCA1) or DNA mismatch-repair genes (for example, MLH1 or the O6-methylguanine-DNA methyltransferase gene, MGMT). They can also function through effects on apoptosis, invasion and the cell cycle. Gene silencing by CpG methylation has received the most attention, partly because sensitive methods of measurement have become available. It has been reported, for example, that differences in methylation can distinguish prostate cancer from benign prostatic hyperplasia. Shedding of hypermethylated DNA into saliva from oral malignancies, into sputum or bronchoalveolar lavage fluid from lung cancer, and into serum from patients with lung, bladder or colorectal cancer has also been demonstrated. Pharmacogenomic effects of methylation silencing, with implications for choice of therapy, have also been shown. For example, promoter region methylation of MGMT, an enzyme that reverses 5′-guanine alkylation, predicts the response or resistance of tumours to nitrosourea alkylating agents [19].

Other potential DNA biomarkers include SNPs, mitochondrial DNA markers and oncoviral markers. Particular SNPs are associated with increased cancer risk, and haplotype assessment can be predictive of several cancers, such as breast, prostate and lung cancer. Similarly, mutations in mitochondrial DNA occur in cancers of the colon, bladder, head and neck, and lung, among others.

6.2 RNA biomarkers

Whereas most DNA markers are evaluated individually, many high-throughput technologies can assess mRNA expression comprehensively [20]. Most RNA-based biomarkers undergoing clinical evaluation consist of multi-gene molecular patterns or 'fingerprints'. Although such patterns can be more accurate than single-molecule markers, choosing which genes to include in the pattern adds a layer of statistical complexity, prompting new developments in biostatistics, bioinformatics and data visualization. Molecular markers and their patterns have been analysed by various supervised and unsupervised algorithms, most prominently by double hierarchical clustering methods that lead to colour-coded 'clustered image maps' (CIMs) [21]. For example, pattern-based RNA-expression analysis of clinical breast cancers has identified previously unknown molecular subtypes that are associated with differences in survival. That analysis has also provided increased prognostic capability, predicted response to neo-adjuvant therapy, predicted the likelihood of metastasis in lymph-node negative patients and correctly predicted tumour grade from laser-capture microdissected specimens. The transcript levels of enzymes important for drug metabolism have been used preclinically to predict the response to chemotherapy in lung and colon cancers. However, extensive validation studies will be required to move those developments from clinical research to standard practice in staging [19].

6.3 Protein biomarkers

Most of the biomarkers in clinical use are single proteins. Just as pattern-based RNA biomarkers frequently outperform single RNA markers in tumour classification, prognosis or prediction of response to therapy, protein-based 'fingerprints' may outperform individual protein markers. Technologies such as differential in-gel electrophoresis (DIGE), two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) and multidimensional protein-identification technology (MudPIT) can be used for higher-throughput profiling with microgram quantities of protein. Other high-throughput technologies, such as the reverse-phase microarray and surface-enhanced laser desorption ionization time-of-flight (SELDI-TOF) mass spectrometry, are more sensitive (in the femtomolar range) and can cover more of the 12 orders of magnitude spanned by serum-protein expression levels. Emerging nanotechnologies, such as immuno-PCR, field effect transistor (FET)-based protein detection and quantum dots, promise further increases in the sensitivity of protein markers, but those techniques are currently experimental.

Protein quantity by itself might not be the salient marker parameter. Protein function often depends instead on phosphorylation, glycosylation and other post-translational modifications, and on location in the cell and/or in the tissue. The important phosphorylation-dependent signalling cascades can be assessed, for example, using reverse-phase arrays. Laser-capture microdissection and similar technologies can be used to obtain DNA, mRNA or protein from precise locations within a tumour and thereby distinguish markers inherent to the malignant cells from those in other cell types within the tumour. Microdissection has enhanced expression profiling of various cancer types [22].
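The pattern-based analysis described in section 6.2 can be illustrated with a toy example. The sketch below is only an illustration, assuming average-linkage agglomerative clustering on Euclidean distance; the function names and the expression values are hypothetical and are not taken from the studies cited. It clusters samples only; a 'double' clustering, as used for clustered image maps, would apply the same procedure a second time to the genes (the transposed data).

```python
from itertools import combinations
from math import dist


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean Euclidean distance over all cross-cluster pairs of profiles."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def hierarchical_clustering(profiles):
    """Agglomerative clustering; returns the merges in the order they occur."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters that are closest under average linkage.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges


# Hypothetical expression values (arbitrary units) for four samples
# measured on three genes; invented for illustration only.
toy_profiles = {
    "sample_1": (8.1, 0.9, 5.0),
    "sample_2": (7.9, 1.1, 5.2),
    "sample_3": (1.0, 6.8, 0.4),
    "sample_4": (1.2, 7.1, 0.5),
}

merges = hierarchical_clustering(toy_profiles)
for left, right in merges:
    print(left, "+", right)
```

In this toy data set the two pairs of similar samples are joined first and the two resulting groups last, which is the kind of grouping that a clustered image map displays as blocks of similar colour.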
7. Biomarker use across the spectrum of diseases

7.1 Risk assessment

Risk assessment is the qualitative and quantitative evaluation of the risk posed to human health by the actual or potential presence of specific risk characteristics. For example, cardiovascular risk assessment by tables and charts based on the Framingham equation is widely used [4]. Various biomarkers have been used to improve prediction by the Framingham score, including lipoprotein-associated phospholipase A2, vitamin B6, IL-6, C-reactive protein and soluble thrombomodulin [6].

7.2 Screening

Screening discriminates healthy individuals from those with asymptomatic disease by testing particular groups. Biomarkers are important for screening and early diagnosis. For example, the prognosis of advanced HCC is poor, whereas smaller HCCs suitable for organ transplantation, surgical resection or radiofrequency ablation have shown a better prognosis and longer survival. Therefore, detection of HCC at an early stage heavily affects the clinical outcome of these patients. For this reason, a surveillance programme using alpha-foetoprotein (AFP) and ultrasound (US) every six months has been recommended, and is widely practised. So far, AFP, the only serological marker commonly used in diagnosis, has failed to be a reliable marker, mainly because of its poor sensitivity, which ranges from 39% to 65%, with a specificity ranging from 76% to 97%. AFP seems to be reliable at values over 400 IU/ml, but the percentage of patients with such high levels is very small; this represents one of the most important limits of this marker. Various other markers for HCC diagnosis have been evaluated, including the fucosylated variant of the AFP glycoprotein, whose sugar chain has a high affinity for Lens culinaris agglutinin (AFP-L3), hepatoma-specific AFP and AFP-mRNA, des-gamma-carboxy prothrombin (DCP), glypican-3 (GPC3), squamous cell carcinoma antigen (SCCA), immunoglobulins of the IgM class forming complexes with either AFP (AFPIC) or SCCA (SCCAIC), tissue polypeptide specific antigen, hepatoma-specific gamma-glutamyl transferase isoenzyme, transforming growth factor (TGF)-β1 and TGF-β1 mRNA, insulin-like growth factor (IGF)-II and IGF-II mRNA, and genetic alterations of telomerase. However, used individually, these markers do not have good performance characteristics. The combination of SCCA, SCCAIC, AFP and AFPIC has been investigated. This combination of biomarkers allows the identification of almost 80% of tumours with normal AFP, which represent the most difficult challenge for clinicians [23].

8. Diagnosis, treatment, prognosis and prediction of response

8.1 Classification, grading and staging

Classification of the tissue of origin of a disease, especially a malignancy, is the first step towards predicting survival and choosing therapy. Because a tumour's anatomical location usually indicates its tissue of origin, molecular markers are rarely required. Histological examination generally confirms the diagnosis and identifies the tumour subtype. However, new molecular markers might sometimes be helpful in the differential diagnosis. By using a combination of high-throughput RNA, protein and tissue microarray technologies, markers potentially useful for distinguishing between colon and ovarian abdominal carcinomas from an unknown primary location can be identified [24].

Each anatomical site has its own histological grading system, designed to classify malignancies by degree of differentiation. Low-grade, well-differentiated tumours are usually less aggressive and have a more favourable prognosis than high-grade tumours, which tend to grow faster and metastasize earlier. However, tumour grade is included in formal TNM staging only when intimately linked to prognosis, as it is for soft-tissue sarcomas, prostate cancer and primary brain malignancies. Assignment of grade is inherently subjective and dependent on the skill and experience of the reviewing pathologist, but several reports indicate that biomarker patterns can correctly score tumours according to their pathologist-assigned grades. Computer-aided diagnostic (CAD) systems have been used for preliminary grading of cervical smears and for assisted interpretation of radiological images such as screening mammograms, computerized tomography (CT) scans and standard X-ray films [25]. CAD systems are generally designed to make routine distinctions, giving the pathologist time to focus on difficult diagnostic problems. The addition of either individual or pattern-based biomarkers to the assessment of histological grade could increase the utility of grading for predicting response to therapy.

The TNM Committee of the International Union Against Cancer (UICC) has defined staging criteria for most anatomical sites. T, N and M are determined separately and then grouped, usually to classify the cancer into one of four main stages (stages I–IV) and subdivisions thereof. Clinical staging, which is primarily used to guide
initial therapy, integrates information from physical examination with data such as those from standard X-ray, CT, MRI, PET, endoscopic examination, biopsy, and surgical exploration. Pathological staging on the basis of surgical specimens, if acquired, complements clinical staging with a precise determination of the extent of disease and additional histological information. Increasingly, imaging agents targeted at biomarkers are being used for anatomical localization. The most common are radioisotopes, detected by standard nuclear medicine imaging, by single-photon emission computed tomography (SPECT) or by PET. Also under study are fluorescent molecules, which are detected by optical imaging, and paramagnetic particles for enhancing MRI. The target can be any marker that delineates the cancer or its metabolism. For example, 18F-FDG, 11C-acetate, and dual-tracer PET/CT have recently been shown to have a relatively high sensitivity for the detection of extrahepatic metastases of HCC and may be helpful in HCC staging [26].

Some tumours (for example, carcinoid, phaeochromocytoma, and cancers of the prostate, thyroid and colon) can be targeted by specific radiolabelled ligands. Carcinoid tumours, for example, are often localized using a radiolabelled analogue of octreotide (111-indium pentetreotide), which avidly binds to the somatostatin receptor, a protein commonly overexpressed in those tumours. Nuclear medicine-based imaging modalities are also clinically useful for evaluating tumour-related phenomena including angiogenesis, apoptosis, proliferation, metabolism, hypoxia and drug resistance (such as P-glycoprotein function). Molecularly targeted functional imaging has enormous potential for staging, as it does for other aspects of diagnosis and management [19].

Staging could also be useful in non-malignant diseases. For example, from a clinical management viewpoint, accurately assessing the extent and progression of liver fibrosis in cases of chronic liver disease is important. Liver biopsy is the current gold standard but is poorly suited for active monitoring because of its expense and morbidity. Thus, development of alternatives that are safe, inexpensive, and reliable is a priority. There have been tremendous advances in biomarkers for non-invasive assessment and staging of liver fibrosis. Table 5 shows the various blood biomarkers evaluated for staging of liver fibrosis. The following have been assessed and are being developed for staging liver fibrosis [27]: routine laboratory tests (aspartate aminotransferase (AST) to alanine aminotransferase (ALT) ratio; gamma glutamyl transferase (GGT); cholesterol; platelet count; AST to platelet ratio; and insulin resistance); various proprietary test panels (the 'PGA index', which combines prothrombin time, GGT, and apolipoprotein A1, and which was later modified to include alpha-2-macroglobulin as the 'PGAA index'; and 'Fibrotest', which combines alpha-2-macroglobulin, haptoglobin, GGT, apolipoprotein A1, and total bilirubin); specialized tests of liver function (indocyanine green; sorbitol; galactose clearance tests; 13C-galactose breath test; 13C-aminopyrine breath test; and MEGX test); and serum ECM markers of fibrosis (the 'Fibrospect' panel, comprising hyaluronic acid, TIMP-1, and alpha-2-macroglobulin; collagen IV; collagen VI; amino terminal propeptide of type III collagen (PIIINP); apolipoprotein A-IV; complement C-4; serum retinol binding proteins; serum N-glycans; etc.).

8.2 Prognosis and treatment selection

Tumour classification, stage and sometimes grade are used to assess prognosis. Biomarker expression often supplants or complements tumour classification, stage and grade when biologically targeted therapeutics are under consideration. Prominent examples include CD20 positivity for treatment of lymphomas with rituximab, HER2/NEU positivity for treatment of breast cancer with trastuzumab, BCR-ABL translocation for treatment of chronic myelogenous leukaemia (CML) with imatinib, and KIT or platelet-derived growth factor receptor-α (PDGFRA) positivity for treatment of gastrointestinal stromal tumours (GIST) with imatinib [19].

Both prognosis and prediction of response are necessary for the selection of neoadjuvant or adjuvant chemotherapy. Tissue classification, TNM staging, molecular biomarkers, grade and other factors might be used in combination for that purpose. The combinations of variables might not be easy to analyse manually, but computer decision support systems (DSS) can make the assessments automatically [28]. Biomarkers can also be used to avoid idiosyncratic drug toxicity, such as the sustained, life-threatening leukocyte suppression seen when mercaptopurine is given to leukaemia patients with homozygous mutations of the thiopurine methyltransferase (TPMT) gene [29].

8.3 Therapy monitoring

With advances in the understanding of tumour biology, interest in molecular biomarkers of carcinogenesis has grown, both in terms of their prognostic significance and their potential as therapeutic targets. For example, surgery, including transplantation, remains the only potentially curative modality for HCC, yet recurrence rates are high and long-term survival poor. The ability to predict individual recurrence risk and subsequently prognosis would help guide surgical and
chemotherapeutic treatment. As understanding of hepatocarcinogenesis has increased, the myriad genetic and molecular events that drive the hepatocarcinogenic disease process, including angiogenesis, invasion and metastasis, have been identified. A number of molecular biomarkers with prognostic significance have been identified in hepatocellular carcinoma (table 6) [30].

Table 5. Blood markers used to detect and stage liver fibrosis.
Name | Components | Sensitivity/Specificity for advanced fibrosis | PPV/NPV for advanced fibrosis
AST/ALT ratio | AST/ALT | 53%/100% | 100%/81%
'Forns' test | platelets, GGT, cholesterol | 94%/51% | 40%/96%
APRI | AST, platelets | 41%/95% | 88%/64%
PGA index | platelets, GGT, apolipoprotein A | 91%/81% | 85%/89%
Fibrotest | GGT, haptoglobin, bilirubin, apolipoprotein A, alpha-2-macroglobulin | 87%/59% | 63%/85%
Fibrospect | hyaluronic acid, TIMP-1, alpha-2-macroglobulin | 83%/66% | 72%/78%
FPI | AST, cholesterol, HOMA-IR | 85%/48% | 70%/69%
ELF | collagen IV, collagen VI, amino terminal propeptide of type III collagen (PIIINP), matrix metalloproteinase 2 (MMP-2), matrix metalloproteinase 9 (MMP-9), tissue inhibitor of matrix metalloproteinase 1 (TIMP-1), tenascin, laminin, and hyaluronic acid (HA) | 90%/41% | 35%/92%
Abbreviations: AST, aspartate aminotransferase; GGT, gamma glutamyl transpeptidase; APRI, AST to platelet ratio index; TIMP-1, tissue inhibitor of metalloproteinase 1; ECM, extracellular matrix; HOMA-IR, homeostasis model assessment (for insulin resistance).

Table 6. Molecular markers of prognostic significance in hepatocellular carcinoma.
Hepatocarcinogenic process | Potential prognostic marker
Proliferation, self-sufficiency in growth signals, insensitivity to antigrowth signals | p53*, nm-23, Rb, PTEN*, c-met*, c-myc*, cyclin A, cyclin D, cyclin E, p15, p16, p18, p19, p21, p27, p57, TGF-b, EGFR family, growth factors, proliferation indices*
Avoidance of apoptosis | p53*, Bcl-2, Bcl-xL, Bax, Bak, Bcl-xS, survivin
Limitless replicative potential | Telomerase (including TERT)*
Sustained angiogenesis | MVD, VEGF*, HIF-1a*, NOS, bFGF, PD-EGF, tissue factor, endostatin/collagen XVIII, interleukin-8, angiopoietin
Tissue invasion and metastasis | MMPs*, uPA, cadherin/catenin complex
Genomic instability | Chromosomal instability, aneuploidy*, microsatellite instability
Abbreviations: nm-23, non-metastatic protein-23; Rb, retinoblastoma gene; PTEN, phosphatase and tensin homolog; TGF-b, transforming growth factor beta; EGFR family, epidermal growth factor receptor family; TGF-a, transforming growth factor alpha; HB-EGF, heparin-binding epidermal growth factor; TERT, telomerase reverse transcriptase; MVD, microvessel density; VEGF, vascular endothelial growth factor; HIF-1a, hypoxia-inducible factor-1 alpha; NOS, nitric oxide synthase; bFGF, basic fibroblast growth factor; PD-EGF, platelet-derived endothelial growth factor; MMP, matrix metalloproteases; uPA, urokinase plasminogen activator.

Research into the molecular biology of hepatocarcinogenesis has identified a multitude of molecular biomarkers with potential prognostic significance. Markers of particular interest include p53 mutation, PTEN, c-met, c-myc, p18, p27, p57, serum VEGF, HIF-1a, MMP-2, -7, and -12, as well as proliferation indices, telomerase activity and aneuploidy. Combining panels of molecular
biomarkers with more traditional histopathological characteristics may enable more accurate prediction of those at high risk of disease progression and more appropriate targeting of resources. In addition to biomarker expression in resected specimens or biopsy samples, further emphasis should be placed on the role of circulating serum biomarkers. Assessment of molecular biomarkers in serum (for example, pre-operative serum VEGF), as well as in other body fluids including urine, may allow formulation of pre-operative prognostic criteria to identify patients most likely to benefit from particular therapies, such as hepatic resection and transplantation, as well as to predict those most likely to respond to different chemotherapeutic agents. It may be that high-risk patients gain no advantage from undergoing hepatic resection compared with a less invasive treatment modality, such as tumour ablation, with its reduced morbidity, mortality, and cost. In addition, the ability to stratify patients' prognoses pre-operatively would improve provision of patient information when obtaining informed consent, allow assessment of the need for adjuvant therapies, and facilitate comparative studies and clinical trials. Serum and urinary biomarkers may also have a potential role in screening for recurrent disease following treatment. Ho and colleagues [31] used microarray analysis to identify 14 genes that could discriminate between patients with vascular invasion and those without. They subsequently tested the prognostic value of this finding on a separate group, finding a significantly poorer disease-free survival in those patients predicted to have vascular invasion, and therefore to be at higher risk of recurrence. Work by Iizuka and colleagues based on microarray analysis identified a group of genes that could predict intrahepatic recurrence with a positive predictive value of 88% and a negative predictive value of 95% [32].

9. Drug development based on molecular biomarkers and targeted personalized therapies

In the treatment of diseases, especially cancer, there is a shift from traditional clinical practices to novel approaches. Traditionally, cancer patients were treated with drugs of low toxicity or of high tolerance, regardless of their efficacy in a given patient, provided the benefits of the drug had been proven in both experimental and clinical conditions. However, recent advances in basic and clinical research have provided opportunities to develop 'personalized' treatment strategies. These novel approaches are intended to identify individualized patient benefits of therapies, minimize the risk of toxicity and reduce the cost of treatment. The biggest challenge for researchers and clinicians today is to decide which type of biomarker to use across the wide spectrum of disease processes. In cancer, genomic studies are valuable because every cancer cell shows some degree of genetic damage, which might not be present in normal cells of the body. In contrast to genomic DNA markers, phenotypic expression markers (RNA/protein) vary among cell types, change over time and show different post-transcriptional or post-translational modifications. However, proteins, peptides and metabolites are abundant, are easily accessible in body fluids such as blood, urine, cerebrospinal fluid and secretions, and show promise for measuring outcomes and studying changes in disease state. Another challenge in characterizing biomarkers is the complexity of the expression profile of potential markers in benign conditions close to the disease phenotypes. The evolving trend is the use of patterns of markers instead of a single marker. This approach could, to some extent, reduce the error rate in predicting the outcome or the severity of side effects during targeted therapies.

With increasing knowledge of the molecular pathways underlying the development of various diseases, the selection of patients for targeted drugs, and the efficacy of those drugs, will in future be based on molecular profiling or phenotypic expression of the target molecules in malignant tissues. These targeted drugs shut down their specific pathway or sets of pathways. Because the response to targeted drugs can be predicted, their indiscriminate use in all patients can be ruled out, which helps to avoid unnecessary drug-associated side effects.

For example, HCC is a tumour with several genomic alterations. There is evidence of aberrant activation of several signalling cascades, such as epidermal growth factor receptor (EGFR), Ras/extracellular signal-regulated kinase, phosphoinositol-3-kinase/mammalian target of rapamycin (mTOR), hepatocyte growth factor/mesenchymal-epithelial transition factor, Wnt, Hedgehog, and apoptotic signalling. Recently a multikinase inhibitor, sorafenib, has shown survival benefits in patients with advanced HCC. This advancement represents a breakthrough in the treatment of this complex disease and proves that molecular therapies can be effective in HCC. It is becoming apparent, however, that to overcome the complexity of genomic aberrations in HCC, combination therapies will be critical. Phase II studies have tested drugs blocking EGFR, vascular endothelial growth factor/platelet-derived growth factor receptor, and mTOR signalling. Future research is expected to identify new compounds to block important undruggable pathways, such as Wnt signalling, and to identify new oncogenes as targets for therapies through novel high-throughput
technologies. Biomarkers and molecular imaging should be part of the trials, in order to optimize the enrichment of study populations and identify drug responders. Ultimately, a molecular classification of HCC based on genome-wide investigations and identification of patient subclasses according to drug responsiveness will lead to a more personalized medicine [33] (table 7).

Table 7. Molecular targeted agents in clinical development in cancer.
Cancer cell function, target: Agent (type)
Signal transduction
  Growth factor receptors
    EGFR: Gefitinib (TKI), Erlotinib (TKI), Cetuximab (mAb), Panitumumab (mAb)
    HER2: Trastuzumab (mAb), Lapatinib (TKI)
    PDGFR: Imatinib (TKI), Sunitinib (TKI), Sorafenib (TKI)
    FLT3: Lestaurtinib (TKI), PKC 412 (TKI), Sunitinib
    IGFR1: IMC-A12 (mAb)
    c-MET: SU11274, JNJ-38877605, ARQ197
    c-KIT: Imatinib, Dasatinib
  Intracellular signaling
    RAS: Farnesyl transferase inhibitor Tipifarnib
    RAF: Sorafenib
    MEK: Vandetanib, AZD6244
    mTOR: Temsirolimus, Everolimus, Rapamycin
Angiogenesis
  Growth factors
    VEGF: Bevacizumab (mAb)
  Growth factor receptors
    VEGFR: Sorafenib, Sunitinib, Brivanib, Cediranib, Vatalanib, IMC1121B (mAb)
    PDGFR: Sorafenib, Imatinib, Sunitinib
Apoptosis
  Intrinsic pathway
    BCL2: GX15-070, Oblimersen
  Extrinsic pathway
    Apo2L/TRAIL: Mapatumumab, Apomab, AMG-655, rhApo/TRAIL
Protein turnover
  Proteasome: Bortezomib
Chromatin remodeling
  HDAC: SAHA
  DNA methyltransferase: Decitabine
Cell cycle
  CDKs: Flavopiridol (CDKI)
Migration and invasion
  SRC: Dasatinib, XL228
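The sensitivity, specificity and predictive values quoted for AFP (section 7.2) and for the fibrosis panels (table 5) are linked through disease prevalence by Bayes' rule, which is one reason why enriching a study population changes how a marker performs. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, the sensitivity and specificity are the upper ends of the AFP ranges given in section 7.2, and the two prevalence figures are invented purely for illustration (prevalence is assumed to lie strictly between 0 and 1).

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Upper ends of the AFP ranges quoted in section 7.2.
AFP_SENSITIVITY = 0.65
AFP_SPECIFICITY = 0.97

# Hypothetical prevalences: a broad surveillance group versus an enriched one.
ppv_low, npv_low = predictive_values(AFP_SENSITIVITY, AFP_SPECIFICITY, 0.05)
ppv_high, npv_high = predictive_values(AFP_SENSITIVITY, AFP_SPECIFICITY, 0.30)

print(f"prevalence 5%:  PPV={ppv_low:.2f}, NPV={npv_low:.2f}")
print(f"prevalence 30%: PPV={ppv_high:.2f}, NPV={npv_high:.2f}")
```

With the same test characteristics, the positive predictive value rises and the negative predictive value falls as prevalence increases, so predictive values reported in one population should not be transferred uncritically to another.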