
Optimization of an Untargeted Metabolomic Platform for Adherent Cell Lines




Metabolites are the end products of all cell function. Metabolomics is the study of these small molecules such as sugars, lipids, and amino acids using analytical techniques including liquid chromatography coupled to mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR). These instruments can measure differences between disease states and drug treatments as well as other physiological effects on cells. Ideal sample preparation is highly precise, accurate, and efficient with minimal metabolite loss. However, there is currently no universal method for isolating and analyzing these metabolites efficiently. Additionally, there is a lack of effective normalization strategies in the field of metabolomics, making comparison of results from different cell types or treatments difficult. The traditional method for data normalization uses protein concentration; however, the buffers used to extract metabolites effectively are not compatible with proteins, leading to protein precipitation and erroneous protein concentration measurements. Although time-consuming, cellular imaging would provide the best results in normalizing for cell number. We conducted a study comparing alternative filtration methods during sample preparation. Two types of filtration plates were analyzed for a variety of compound classes, features, and levels of sensitivity. Data normalization was tested by linearly increasing cell number and comparing whether protein concentration, cell debris weight, or DNA concentration increased at a rate that was proportional to cell number. Sample preparation was found to be more efficient using protein precipitation plates compared to phospholipid removal plates during lysate filtration. After testing several techniques to normalize data by cell number, DNA extraction from the cell debris demonstrated the most time-efficient, consistent, and reproducible results.


Metabolomics is the study of small molecules, such as lipids, amino acids, and sugars present within the cell.1,2 This field gives insight into cellular phenotype and function.3 Untargeted metabolomics is the common method used to repeatedly and quickly identify as many small molecules as possible at a specific occurrence using high resolution/high mass accuracy mass spectrometry.1-4 As a response to variations in environment or drug treatment, changes in metabolites can give insight into what is happening within the cell. Detection of these changes can facilitate identification of possible cancer oncogenesis.1

Despite the potential of metabolomics to provide new insights into cancer, it suffers from a lack of robust methodology. For example, at the time of this writing, the largest number of metabolites reported for a single study is approximately 400, whereas the number of metabolites in the human metabolome is estimated to be on the order of 40,000.2 Since the amount and types of measured metabolites are largely influenced by the method chosen for sample preparation, one focus of this report is sample preparation-related factors that could improve both the number of measured metabolites and the accuracy or precision with which particular metabolites are measured.2

Ideal sample preparation involves a procedure that is reproducible, fast, simple, and free of metabolite loss or degradation during extraction.2 During collection, the noise-to-signal ratio is problematic, but there are several ways to decrease it. One method focuses on protein precipitation. In this technique, metabolites are extracted from cells using solvents that quench metabolism and are filtered to remove protein and cell debris.5 Using a different approach, Ferreiro-Vera et al. found that using solid-phase extraction (SPE) to remove phospholipids yielded three- to four-fold up-regulation of metabolite signals compared to liquid-liquid extraction (LLE).5 We chose to focus on SPE due to its significant increase in metabolite signals and its simpler workflow.

Another problem associated with metabolomic data is the absence of robust methods for data normalization. Because raw metabolite measurements exhibit variation attributable to a number of sources, identification of a normalizer capable of reducing or eliminating the effects of such variation is important. Typically, total protein content or cell numbers are used.6 However, measuring total protein is not straightforward, as the buffers needed to extract metabolites are not compatible with proteins; for example, methanol in the buffers causes protein precipitation. While cell imaging as a surrogate for quantifying cell number is a good option, it is time-consuming and therefore not preferred because of the time-sensitivity of several drug treatments.6 Thus, an alternative, faster method that employs the same sample for both metabolite extraction and data normalization is needed.

Mass spectrometry (MS) was preferred over NMR spectrometry because MS has greater sensitivity, selectivity, and range of identifiable metabolites.11 One weakness of MS relative to NMR, however, is that the sample cannot be recovered. MS allows the user to identify a metabolite by comparing the molecular weight of the molecule with the tandem mass spectrometry (MS/MS) spectra and retention time of the sample.11

The overall objectives of our research were to improve metabolomic data quality by 1) identifying a robust surrogate for the cell number method to facilitate data normalization and 2) optimizing sample preparation methods. Two different sample filtration strategies were tested—one using a protein precipitation plate and a second using a new phospholipid-depleting filter plate. Additionally, several data normalization strategies were evaluated. Finally, we applied the resulting optimizations to investigate the mechanism of action of L-asparaginase through its metabolic response.


Cell Lines

OVCAR-8 and OVCAR-4 are ovarian cancer cell lines from the subpanel of the NCI-60, one of the most highly profiled sets of cell lines in the world. Unlimited, easily obtained, and manipulated, OVCAR-8 cells are sensitive to L-asparaginase, while OVCAR-4 cells are resistant.10 The U251 central nervous system line and MDA-MB-435 melanoma line were also used.9 All cells were routinely maintained in RPMI-1640 media containing 5% fetal bovine serum (FBS) and 1% (2 mM) L-glutamine. Cultures were grown in an atmosphere of 5% CO2 and 90% relative humidity at 37°C.

Sample Preparation

Samples were prepared through filtration of the supernatant, vacuum-assisted evaporation to dryness, and reconstitution of the dried sample. Cell samples in 10 cm dishes were placed on dry ice and washed twice with 4-5 mL of 60% CH3OH, 40% HPLC-grade H2O, and 0.85% (NH4)HCO3, then extracted using 500 µL of CH3OH:CHCl3:H2O (7:2:1) buffer.7,8 The cells were homogenized for 30 s to lyse them and then centrifuged at 20,000 x g at 4°C for 2 min. The lysate was removed and loaded onto the protein removal plate, while the remaining cell debris was saved. To prepare for DNA extraction, the cell debris was air dried, because drying in a speed vacuum makes it difficult for the pellet to fully solubilize.


A phospholipid removal plate was used because Ferreiro-Vera et al. suggested it as a better way to remove noise- or suppression-imparting metabolites.5 We compared the protein removal plate to the phospholipid removal plate. Each well was filled with 500 µL of fresh CH3OH containing 10 µM xylitol, and 200 µL of the lysate was then added to each well. After the lysate was vacuum-filtered through the plates, the samples were dried using a speed vacuum and later reconstituted and analyzed in a hybrid ion trap MS.

Metabolite Analysis

Samples stored at -80°C were reconstituted in 50 µL of mobile phase A (deionized H2O + 0.1% formic acid (FA)) immediately prior to analysis with LC-MS. The LC parameters were as follows: 5 µL of metabolomic sample was injected onto a C18 column (1.8 µm particle size, 2.1 x 50 mm), and analytes were eluted using a 25 min linear gradient of 100% H2O + 0.1% FA to 100% ACN + 0.1% FA. Analytes were detected in positive ion MS mode, and data were analyzed using XCMS Online software (Scripps Research Institute, La Jolla, CA). Averaged metabolite intensities for identified compounds were then plotted against cell number.
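The linear gradient described above can be sketched as a simple function of time. This is only an illustration: it assumes the gradient runs from 0% to 100% mobile phase B (ACN + 0.1% FA) over the stated 25 min with no hold or re-equilibration steps, which the text does not specify. The function name is hypothetical.

```python
def gradient_percent_b(t_min, duration_min=25.0):
    """Percent mobile phase B (ACN + 0.1% FA) at time t_min.

    Assumes a single linear ramp from 0% B to 100% B over duration_min,
    with no hold or re-equilibration steps (an assumption, not from the text).
    """
    if t_min <= 0:
        return 0.0
    if t_min >= duration_min:
        return 100.0
    return 100.0 * t_min / duration_min


if __name__ == "__main__":
    # Print the solvent composition at a few time points along the run.
    for t in (0, 5, 12.5, 25):
        b = gradient_percent_b(t)
        print(f"{t:>5} min: {b:5.1f}% B, {100.0 - b:5.1f}% A")
```

Mobile phase A is simply the remainder, so the composition at any time is fully described by the percentage of B.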

DNA Extraction

As a surrogate protocol for counting cell number, total genomic DNA was extracted and DNA concentration was quantified using a DNA extraction kit from IBI Scientific, following the Red Blood Cell Preparation protocol. After the cell pellets were completely air dried at room temperature, they were reconstituted in 150 µL of lysis buffer. The pellets were then stored at room temperature in the buffer for at least 24 h to allow for maximum resolubilization. DNA preparation was performed by adding GB buffer, shaking vigorously, and incubating at 60 °C for at least 10 min to allow for complete solubilization of the pellet. Afterward, the solution was treated with 100% C2H5OH and transferred to a GD maxi column. For large cell numbers, the solution was split between two columns to avoid saturation of the column. The columns were centrifuged, washed twice with wash buffer, and eluted with 50 µL of elution buffer.



A phospholipid removal plate was evaluated against a protein precipitation filter plate in order to determine which plate yielded the highest intensity metabolites as well as the greatest variety of compound classes. As shown in Figure 1, both filtration plates yielded similar metabolite intensities for most of the known metabolites. However, in the case of the polyamines putrescine and spermidine (Figure 1C), the protein removal plate significantly outperformed the phospholipid removal plate. Only one metabolite, glucosamine, displayed greater signal intensity with the phospholipid removal plate (Figure 1B). This result suggests that the phospholipid removal plate may also be useful for future analysis of other carbohydrates.

Protein Concentration and Cell Debris Mass

To attempt to normalize cell number, we tried measuring protein concentration. As shown in Figure 2, extraction of proteins with lysis buffer (Figure 2A) worked well in comparison to extraction with CH3OH/acetonitrile (Figure 2B), which showed low protein concentration. These were compared using levels of vehicle and L-asparaginase. To obtain better results, we tried another method for cell normalization using cell debris mass. Figure 3 shows that, in systems with fewer cell numbers, cell debris mass showed a low increase in weight; however, in samples with higher cell counts, the mass increased greatly. Neither cell debris nor protein concentration produced a linear relationship with cell number when using conditions and buffers required for metabolomics (Figure 2B and 3).


DNA concentration was measured for four different cell lines to evaluate its association with cell number. First, cell dilutions ranging from 0.25 to 4 million cells/dish were seeded and processed to determine DNA concentration. Figure 4 illustrates a positive linear correlation between cell number and DNA concentration among all four cell lines investigated. Cell number was then evaluated by DNA concentration throughout a 48-hour incubation period. DNA concentration was recorded at 0, 4, 8, 24, and 48 hours after incubation. Analysis of the OVCAR-8 and OVCAR-4 cell lines suggests that a greater number of cells yielded faster growth (Figure 5). However, because cell doubling was not seen within the first 24 hours, these data suggest a slight lag phase in growth after feeding. This observation suggests that proximity between cells affects their growth (i.e., some lines grow faster once confluence exceeds a threshold level).
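One way to check whether a candidate normalizer scales linearly with cell number is an ordinary least-squares fit with its coefficient of determination. The sketch below is a minimal illustration, not the analysis used in the study; the seeding densities follow the 0.25 to 4 million cells/dish range, but the DNA values are placeholders chosen for illustration only, not measurements.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Seeding densities in millions of cells per dish.
    cells_millions = [0.25, 0.5, 1.0, 2.0, 4.0]
    # PLACEHOLDER DNA concentrations (arbitrary units), not real data.
    dna_conc = [5.0, 10.0, 20.0, 40.0, 80.0]
    slope, intercept, r2 = linear_fit(cells_millions, dna_conc)
    print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.4f}")
```

A normalizer that tracks cell number well should give an R^2 close to 1 and an intercept close to zero; the same function could be applied to protein concentration or debris mass for comparison.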

To test this hypothesis, OVCAR-4 cell growth was monitored by cell count (Figure 6) and DNA concentration (Figure 5B). Our results were consistent with the published doubling time of 39 hours for this cell line. However, the results also indicated a lag phase in the doubling of DNA concentration (Figure 5B) and cell number (Figure 6), although the latter lag phase was subtle.11 To validate the existence of the growth lag observed in OVCAR-4 cells, we investigated cell growth by DNA concentration for two additional cell lines: U251 and MDA-MB-435. Because cells of these two lines tend to grow individually, we expected to observe no lag phase. However, as shown in Figure 5C and 5D, both lines exhibited a growth lag. This lag is smaller than that of the OVCAR-4 line but greater than the growth lag of the OVCAR-8 line, as shown by the differences in DNA concentration between the 0 and 8 hour time points.
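The relationship between doubling time and the fold change in DNA concentration or cell count can be sketched as follows. The sketch assumes simple exponential growth with no lag phase, so a measured fold change below the prediction is one way a lag would show up. The function names are hypothetical; only the 39-hour doubling time comes from the text.

```python
import math


def doubling_time(t1_h, n1, t2_h, n2):
    """Doubling time (hours) from two measurements, assuming exponential growth.

    n1 and n2 may be cell counts or DNA concentrations taken at t1_h and t2_h.
    """
    if n1 <= 0 or n2 <= n1 or t2_h <= t1_h:
        raise ValueError("need t2 > t1 and n2 > n1 > 0")
    return (t2_h - t1_h) * math.log(2) / math.log(n2 / n1)


def expected_fold_change(elapsed_h, doubling_time_h):
    """Fold increase expected after elapsed_h hours with no lag phase."""
    return 2 ** (elapsed_h / doubling_time_h)


if __name__ == "__main__":
    # With the published 39 h doubling time, the lag-free prediction for
    # the 24 h and 48 h time points:
    for hours in (24, 48):
        print(f"{hours} h: {expected_fold_change(hours, 39.0):.2f}-fold")
```

Under these assumptions a 39-hour doubling time predicts roughly a 1.5-fold increase at 24 hours, so a full doubling would not be expected until after that time point even without a lag.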


Metabolomic data quality can be dramatically affected by sample preparation and data normalization methods, among many other factors. Ferreiro-Vera et al. reported that removing phospholipids provided improved signal intensity for the majority of retrieved metabolites. Based on these results, we hypothesized that phospholipid depletion would improve metabolite signal intensity across a diverse set of metabolites extracted from the ovarian cancer cell line OVCAR-8.2 However, we found that a protein removal plate yielded greater signal intensity than the phospholipid removal plate for most metabolites. Specifically, the polyamines were significantly depleted in the phospholipid removal plate. Nevertheless, the phospholipid removal plate yielded a notable amount of glucosamine, suggesting that this plate may be helpful for the examination of carbohydrate metabolites.

Initial attempts to use protein concentration and cell debris mass from metabolomics samples failed to provide a solution for normalization. We first tried normalizing to cell number using protein concentration. However, measuring protein in the debris misses the soluble proteins. Protein concentration works well with bicine-CHAPS buffer, which keeps the proteins in solution, but the buffers required for metabolite extraction do not preserve the proteins. In addition, the methanol used for extraction precipitates the proteins, making the measured concentrations extremely low. Realizing that the cell pellet could be put to use, we attempted to normalize to cell number by cell debris mass. This method proved unreliable because the analytical scale was not precise enough to give the same weight in repeated measurements. In the end, DNA extraction from the cellular debris proved to be an extremely effective method of normalization. Using OVCAR-8 and OVCAR-4 cell lines, DNA concentration was shown to increase proportionally to cell number. For time course experiments, cell number can therefore be normalized using DNA extraction from the metabolomic cellular debris.
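In practice, normalizing by DNA amounts to dividing each metabolite intensity in a sample by the DNA concentration measured from that same sample's debris. The following is a minimal sketch under that assumption; the function name, metabolite values, and DNA concentration are hypothetical placeholders rather than data from this study.

```python
def normalize_by_dna(intensities, dna_concentration):
    """Divide each raw metabolite intensity by the sample's DNA concentration.

    intensities: dict mapping metabolite name -> raw signal intensity
    dna_concentration: DNA concentration from the same sample's cell debris
    """
    if dna_concentration <= 0:
        raise ValueError("DNA concentration must be positive")
    return {name: value / dna_concentration for name, value in intensities.items()}


if __name__ == "__main__":
    # PLACEHOLDER values for illustration only.
    raw = {"putrescine": 1000.0, "spermidine": 500.0}
    normalized = normalize_by_dna(raw, dna_concentration=20.0)
    for name, value in normalized.items():
        print(f"{name}: {value:.1f} intensity units per unit DNA")
```

Because the DNA comes from the same dish as the metabolites, differences in seeding density or growth between samples cancel out of the normalized values, provided DNA concentration stays proportional to cell number.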

In conclusion, this study has shown that protein precipitation plates are superior to phospholipid removal plates for most metabolite classes. However, for sugars, it may be more advantageous to use phospholipid removal plates. Rather than manually counting cells through a microscope, DNA extraction can efficiently normalize cell number without affecting the time required for metabolomic analysis.


I would like to thank Dr. John Weinstein for allowing me to have this wonderful research experience, Dr. Phil Lorenzi for supporting me and providing feedback to help make me a better scientist, and Dr. Leslie Silva for her daily guidance, advice, and support through my whole experience at MD Anderson Cancer Center. This research was supported by the University of Texas, MD Anderson Cancer Center.


  1. Dehaven, C.D., et al.; Organization of GC/MS and LC/MS metabolomics data into chemical libraries. J. Cheminform, 2010, 2, 9.
  2. Vuckovic, D.; Current trends and challenges in sample preparation for global metabolomics using liquid chromatography-mass spectrometry. Anal Bioanal Chem, 2012, 403, 1523-48.
  3. Jankevics, A. et al.; Separating the wheat from the chaff: a prioritisation pipeline for the analysis of metabolomics datasets. Metabolomics, 2012, 8, 29-36.
  4. Patti, G.J.; Separation strategies for untargeted metabolomics. J Sep Sci, 2011, 34, 3460-9.
  5. Ferreiro-Vera, C.; F. Priego-Capote; M.D. Luque de Castro; Comparison of sample preparation approaches for phospholipids profiling in human serum by liquid chromatography-tandem mass spectrometry. J Chromatogr A, 2012, 1240, 21-28.
  6. Cao, B. et al.; GC-TOFMS analysis of metabolites in adherent MDCK cells and a novel strategy for identifying intracellular metabolic markers for use as cell amount indicators in data normalization. Anal Bioanal Chem, 2011, 400, 2983-93.
  7. Xiao, J.F.; B. Zhou; H.W. Ressom; Metabolite identification and quantitation in LC-MS/MS-based metabolomics. Trends Analyt Chem, 2012, 32, 1-14.
  8. Lorenzi, P.L. et al.; DNA fingerprinting of the NCI-60 cell line panel. Mol Cancer Ther, 2009, 8, 713-24.
  9. Carnicer, M. et al.; Development of quantitative metabolomics for Pichia pastoris. Metabolomics, 2012, 8, 284-298.
  10. de Jonge, L.P. et al.; Optimization of cold methanol quenching for quantitative metabolomics of Penicillium chrysogenum. Metabolomics, 2012, 8, 727-735.
  11. Shankavaram, U.T. et al.; CellMiner: a relational database and query tool for the NCI-60 cancer cell lines. BMC Genomics, 2009, 10, 277.
  12. Roberts, D. et al.; Identification of genes associated with platinum drug sensitivity and resistance in human ovarian cancer cells. Br J Cancer, 2005, 92, 1149-58.


Aquaporin-4 and Brain Therapy




One of the tasks of modern medicine is to address the many diseases affecting the brain. These maladies come in various forms – including neurodegenerative complications, tumors, vascular constriction, and buildup of intracranial pressure.1,2,3 Several of these disease classes are caused, in part or in full, by faulty waste clearance or water flux. Although a pervasive system of slow cerebrospinal fluid (CSF) movement in the brain’s ventricles is present, a rapid method for clearing solutes from the cortex’s interstitial space, which contains neural tissue and the surrounding extracellular matrix, was unknown.4,5 Recently, Iliff et al. discovered a new mechanism for the flow of CSF in the mouse brain – the “glymphatic” system.5,6 This pathway provides an accelerated mechanism to clear dissolved materials from the interstitial space, preventing a buildup of solutes and toxins.5,6 At the center of this system is the water transport protein aquaporin-4 (AQP4), and the extent of this water channel’s various roles is only now being identified. New perspectives on the mechanism by which the brain is “cleansed” may lead to breakthroughs in therapeutics for brain disorders such as Alzheimer’s Disease (AD), which is the sixth leading cause of death in the US each year.7

The Infrastructure

To understand AQP4 and its role in the brain, the environment in which it operates must be examined. As seen in Figure 1, surrounding the brain are three meninges, protective layers between the skull and the cortex.8 Between these layers – the dura, arachnoid, and pia maters – are cavities, including the subarachnoid space that lies just below the arachnoid layer.9 As the figure shows, directly underneath the pia mater are the cortex and interstitial space. Within the cortex, CSF flows through a system of chambers called the ventricles, illustrated in Figure 2.10 CSF suffuses the brain and has several vital functions, namely shock absorption, nutrient provision, and waste clearance.11 CSF is produced in a mass of capillaries called the choroid plexus and flows through the ventricles into the subarachnoid space, bathing the brain and never crossing the blood-brain barrier.10 After circulating through the brain’s interstitial space and ventricles, the CSF then leaves the brain through aquaporin channels surrounding the cephalic veins.6

A New Plumbing System

A team of researchers has discovered an alternate pathway for CSF that clears water-soluble materials from the interstitial space.6 CSF in this so-called “glymphatic pathway” starts in the subarachnoid cavity and then seeps into the cortex, as seen in Figure 3.6 This fluid eventually leaves the brain, carrying with it the waste generated by cells. CSF enters the parenchyma from the subarachnoid cavity and travels immediately alongside the blood vessels.6 This route, which forms a sheath around the blood vessels, is dubbed the “paravascular” pathway, and CSF enters and exits the interstitial space through these avenues.6 The pia membrane guides this pathway until the artery penetrates the cortex, as seen in Figure 3. From there, the endfeet of astrocytes bind the outer wall.6 Astrocytes are glial cells that play structural roles in the nervous system, and endfeet are the enlarged endings of the astrocytes that contact other cell bodies and contain AQP4 proteins.6,12

Iliff et al. found that AQP4 proteins are highly polarized at the endfeet of astrocytes, which suggested that these proteins provide a pathway for CSF into the parenchyma.6 To test this hypothesis, they compared wild type and Aqp4-null mice on the basis of CSF influx into the parenchyma.6 They injected tracers such as radiolabeled TR-d3 intracisternally, finding that tracer influx into the parenchyma was significantly reduced in Aqp4-null mice.4 According to their model, AQP4 facilitates CSF flow into the parenchyma. There, CSF mixes with the interstitial fluid in the parenchyma; AQP4 then drives these fluids out and into the paravenous pathway by bulk flow.6 The rapid clearance of tracer in wild type mice and the significantly reduced clearance in Aqp4-null mice demonstrated the pathway’s ability to clear solutes from the brain. This finding is important because the build-up of Aβ is often associated with the onset and progression of AD.

AQP4 and Aβ

To facilitate Aβ removal, astrocytes become activated at a threshold Aβ concentration, but undergo apoptosis at high concentrations.13,14,15 Thus, the concentration of astrocytes has to be in a narrow window. A study by Yang et al. further explored the role of AQP4 in the removal of Aβ.16 They found that AQP4 deficiency reduced the astrocytic activation in response to Aβ in mice, and Aqp4-knockout reduced astrocyte death at high Aβ levels.16 Furthermore, AQP4 expression increased as Aβ concentration increased, likely due to a protein synthesis mechanism. Further investigation demonstrated that lipoprotein receptor-related protein-1 (LRP1) is directly involved in the uptake of Aβ, and knockout of Aqp4 reduced up-regulation of LRP1 in response to Aβ.15,16 Finally, AQP4 deficiency was found to alter the levels and time-course of MAPKs, a family of protein kinases involved in the response to astrocyte stressors.16 The role of AQP4 in cleansing the parenchyma as well as modulating astrocytic responses to Aβ thus pinpoints it as a major target for the following potential therapies: repairing defects in toxin clearance from the interstitial space, increasing expression in AQP4-deficient patients to increase astrocyte response, and knocking out Aqp4 in patients with high levels of Aβ to prevent astrocyte damage.

Sleep and Aβ Clearance

Interestingly, there is a link between Aβ clearance and sleep. Xie et al. studied Aβ clearance from the parenchyma in sleeping, anesthetized, and wakeful rodents, obtaining evidence that sleep plays a role in solute clearance from the brain. The researchers found that glymphatic CSF influx was suppressed in wakeful rodents compared to sleeping rodents.17 Glymphatic CSF influx is vital because it clears solutes from the brain in a way somewhat analogous to the way kidneys filter the blood. Real-time measurements showed that the parenchymal space was reduced in wakeful rodents, which led to increased resistance to fluid influx.17 Moreover, Aβ clearance was faster in sleeping rodents. Adrenergic signaling was hypothesized as the cause of volume reduction, implicating hormones such as norepinephrine.17 AQP4 is implicated in this phenomenon, as constricted interstitial space resists the CSF influx that this protein enables.

Aquaporin Therapy

If future treatments are to target AQP4 function, then researchers must learn to manipulate its expression. However, such regulatory mechanisms are not well understood. It is well known that cells can ingest proteins in the plasma membrane and thus modulate the membrane protein landscape. Huang et al. studied this phenomenon with AQP4, utilizing the fact that occluding the middle cerebral artery mimics ischemia and alters AQP4 expression in astrocyte membranes.18 They found that artery occlusion down-regulates AQP4 expression and discovered various mechanisms behind this response.18 Specifically, they determined that, after the onset of ischemia, AQP4 co-localized in the cytoplasm with several proteins involved in membrane protein endocytosis.18 They posited that this co-localization indicates the internalization of AQP4.18 These correlations indicate that AQP4 is intimately connected with fluctuations in brain oxygen and nutrient levels, which are limited when blood flow is restricted.

Future Research

Aquaporin-4 is vital to many processes in the brain, but the range and details of these roles are not yet fully understood. As demonstrated, this protein is the central actor in the newly defined glymphatic system responsible for clearing solutes from CSF in the interstitial space. This function implicates AQP4 in the progression of AD and suggests other brain states and neurological conditions may have links to the protein’s function. Studies have demonstrated that AQP4 expression is dynamic, indicating that it can be regulated. The hope is that modulation of aquaporin expression or function could be used in brain therapy. Future research will no doubt focus on these mechanisms, and discoveries will aid in developing a treatment for various brain disorders.


  1. Goetz, C., Textbook of Clinical Neurology, 3rd Edition; Saunders: Philadelphia, 2007.
  2. Goldman, L. Goldman’s Cecil Medicine; Saunders Elsevier: Philadelphia, 2008.
  3. Karriem-Norwood, V. Brain Diseases, WebMD. (Accessed December 1, 2013).
  4. Crisan, E. Ventricles of the Brain, Medscape. (Accessed December 1, 2013).
  5. Scientists Discover Previously Unknown Cleansing System in Brain, University of Rochester Medical Center. (Accessed February 11, 2014).
  6. Iliff J.J. Cerebrospinal Fluid Circulation: A Paravascular Pathway Facilitates CSF Flow Through the Brain Parenchyma and the Clearance of Interstitial Solutes, Including Amyloid ß. Sci Transl Med 2012, 4, 147ra111.
  7. 2012 Alzheimer’s Disease Facts and Figures. Alzheimer’s Association. (Accessed December 6, 2013).
  8. Dugdale III, D. Meninges of the Brain. MedlinePlus, National Institutes of Health. (Accessed December 1, 2013).
  9. O’Rahilly, R.; Muller, F.; Carpenter, S.; Swenson, R. Chapter 43: The Brain, Cranial Nerves, and Meninges. Basic Human Anatomy. [Online] Dartmouth Medical School: Hanover, 2008. (Accessed December 1, 2013).
  10. Agamanolis, Dimitri. Chapter 14 Cerebrospinal Fluid. Neuropathology. [Online] (Accessed Dec. 1, 2013).
  11. Cerebrospinal Fluid (CSF), National Multiple Sclerosis Society. (Accessed December 1, 2013).
  12. Millodot, M. Astrocytes. Dictionary of Optometry and Visual Science, 7th edition; Butterworth-Heinemann: Oxford, U.K., 2009.
  13. Nielsen, H.M. et al. Glia [Online] 2010, 58, 1235-1246.
  14. Kobayashi, K. J Alzheimer’s Dis [Online] 2004, 6, 623-632.
  15. Arelin, K. Brain Research Molecular/Brain Research [Online] 2002, 104, 38-46.
  16. Yang, W. Mol Cell Neurosci [Online] 2012, 49, 406-414.
  17. Xie, L Science [Online] 2013, 342, 373-377.
  18. Huang, J. Brain Research [Online] 2013, 1539, 61-72.
  19. Almodovar, B. et al. Rev Cubana Me Top [Online] 2005, 57, 3, 230-232.
  20. Ibe, B.C., et al. J. Tropical Pediatr. [Online] 1994, 40, 315-316.
  21. Slowik, G. What Is Meningitis? eHealthMD. (Accessed Dec. 1, 2013).
  22. Iadecola C. and Nedergaard M. Glial regulation of the cerebral microvasculature. Nat Neurosci [Online] 2007, 10, 1369-1376.


Cancer Ancestry: Identifying the Genomes of Individual Cancers



Cancer-causing mutations in the human genome have been a subject of intense research over the past decade. With increasing numbers of mutations identified and linked to individual cancers, the possibility of treating individual patients with a customized treatment plan based on their individual cancer genome is quickly becoming a reality.

Cancer arises when individual cells acquire mutations in their DNA. These mutations allow cancerous cells to proliferate uncontrollably, aggressively invade surrounding tissues, and metastasize to distant locations. Based on this progression, a potentially tremendous implication emerges: if every type of cancer arises from an ancestor cell that acquires a single mutation, then scientists should be able to trace every type of cancer back to its original mutation through modern genomic sequencing technologies. High volumes of the human genome have been analyzed in search of these ancestor mutations using a variety of techniques, the most common of which is a large Polymerase Chain Reaction (PCR) screen. In this type of study, DNA of up to one hundred cancer patients is sequenced; the sequences are then analyzed for repeating codons, the DNA units that determine single amino acids in a protein. Analyzing the enormous volume of data from these screens requires the efforts of several institutions. The first 90 cancer-causing mutations were identified at the Johns Hopkins Medical Institute, where scientists screened 11 breast cancer and 11 colorectal cancer patients’ genomes.3 After this study was published in 2006, researchers found these 90 mutations across every known type of cancer. These findings stimulated even more ambitious projects: if the original cancer-causing mutations are identified, scientists may be able to reverse the cancer process by removing faulty DNA sequences using precisely targeted DNA truncation proteins.

However, such a feat is obviously more easily said than done. One of the many obstacles in identifying cancer genomes is the fact that approximately 10 to 15% of cancers derive large portions of their DNA from viruses such as HIV and Hepatitis B. The addition of foreign DNA complicates the search for the original mutation, since viral DNA and RNA are propagated in human cells. This phenomenon masks human mutations that may have existed before the virus entered the host cell. In addition, because tumors are inherently unstable, cancers may lose up to 25% of their genetic code due to errors in cell division, making the task of tracing them even more difficult. Finally, the mutations in every individual cancer have accumulated over the patient’s lifetime; differentiating between mutations of the original cancerous cell line and those caused by aging and environmental factors is an arduous task.

In order to overcome these challenges, scientists use several approaches. First, they increase the sample size—this strategy ensures that the mutations are not specific to an individual organism or geographic area but are common in all patients with that type of cancer. Second, accumulated data concerning viral genomes allow scientists to screen for and mark the areas of viral origin in patients’ DNA. Several advances have already been made despite the difficulties: for instance, in endometrial cancer—a cancer originating in the uterine lining—mutations in the Nucleotide Excision Repair (NER) and MisMatch Repair (MMR) genes have been found to occur in 13% of all affected patients.4 NER and MMR are involved in DNA repair mechanisms and act as the body’s “guardians” of the DNA replication process. In a healthy individual, both NER and MMR ensure that each new cell receives a complete set of functional chromosomes following cell division. In a cancerous cell, these two genes acquire a mutation that permits replication of damaged and mismatched DNA sequences. Similarly, mutations in the normally tumor-suppressing Breast Cancer Type 1 Susceptibility Gene 1 and Gene 2 (BRCA1 and BRCA2) have been identified as major culprits in breast cancer. In prostate cancer, E-26 Transformation Specific (ETS) and Transmembrane Protease, Serine 2 (TMPRSS2) are two DNA transcription regulatory proteins discovered to initiate the disease process.5

One of the latest frontiers in cancer treatment is the identification and study of individual, disease-causing mutations. Thousands of tumor genomes have been sequenced to discover recurring mutations in each cancer, and tremendous advances have been made in this emergent field of cancer genomics. Further study will ultimately aim to tailor cancer treatment to the patient’s specific set of mutations in the emerging field of personalized medicine. This strategy is already being used in the treatment of leukemia at the Cincinnati Children’s Hospital, where a clinical study has been underway since August 2013.2 This trial uses a combined treatment program that includes standard drug therapy while targeting a specific mutation in the mTOR gene, which is responsible for DNA damage repair. Thus, less than a decade after researchers first began to identify unique cancer-causing mutations, treatment programs tailored to patient genomes are becoming a reality.


  1. Lengauer, C. et al. Nature 1998, 396, 643-649.
  2. Miller, N. (accessed Nov 9, 2013).
  3. Sjöblom, T. et al. Science 2006, 314, 268-274.
  4. Stratton, M. et al. Nature 2009, 458, 719-724.
  5. Tomlins, S. et al. Science 2005, 310, 644-648.