The human genome is largely composed of long stretches of repetitive nucleotide sequences that do not code for protein (only about two percent of the human genome codes for protein). This mysterious, non-protein-coding DNA was once disregarded as junk DNA, but scientists have begun to find important sequences within this 'junk,' which is now sometimes called genomic 'dark matter.' Some of these sequences appear to have regulatory functions and can control the expression of some protein-coding genes. Studying them can be extremely challenging, however, because they cannot be analyzed with the standard techniques used for protein-coding genes.
Scientists have now found a promising use for the dark genome. Reporting in Science Translational Medicine, researchers created a method that reveals elements of the dark genome in cancerous tissue and in fragments called cell-free DNA (cfDNA). These bits of DNA are shed by tumors and circulate around the body in the bloodstream. The technique may eventually help scientists or clinicians identify cancer or monitor the progress of treatment.
The approach is called ARTEMIS (Analysis of RepeaT EleMents in dISease). In this work, it was used to assess more than 1,200 repeat element features, which make up about half of the entire human genome. The researchers identified many novel repeats that are linked to cancer and that change as tumors grow. The study also revealed other changes in cfDNA, which could help define new ways to detect different types of cancer.
"When you think about existing cancer genes and the DNA sequences around them, they're just chock full of these repeats," said Professor Victor E. Velculescu, MD, PhD, co-director of the Cancer Genetics and Epigenetics Program at the Johns Hopkins Kimmel Cancer Center.
The researchers searched for 1.2 billion tiny DNA sequences, or k-mers, looking for unusual repeats. Many were enriched in cancer-linked genes. Of the 736 human genes that are thought to promote cancer, 487 had fifteen times as many repeats as expected. There was also an increase in repeat sequences in genes that are associated with cancer-linked signaling pathways.
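For readers unfamiliar with the term, a k-mer is simply a short stretch of DNA that is k letters long, and counting how often each one appears is a basic way to spot repeated sequence. The Python sketch below illustrates only that general idea; the function name `count_kmers` and the toy sequence are invented for illustration, and this is not the ARTEMIS method itself.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Toy sequence, made up for illustration, with an obvious repeated unit.
toy_sequence = "ACGTACGTACGTTTGA"
counts = count_kmers(toy_sequence, k=4)

# Keep only the k-mers that occur more than once.
repeated = {kmer: n for kmer, n in counts.items() if n > 1}
print(repeated)
```

A real analysis would scan billions of sequencing reads rather than a sixteen-letter string, but the counting step follows the same principle.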
"Until ARTEMIS, this dark matter of the genome was essentially ignored, but now we're seeing that these repeats are not occurring randomly," said Velculescu. "They end up being clustered around genes that are altered in cancer in a variety of different ways, providing the first glimpse that these sequences may be key to tumor development."
ARTEMIS was also combined with DELFI (DNA evaluation of fragments for early interception), a method previously created by Velculescu and colleagues. DELFI can find changes in the length and distribution of cfDNA fragments. These models were able to predict when cancer would arise and where it would occur, with an accuracy of almost 70 percent. The accuracy improved to 83 percent when the model was permitted to suggest a second possible type of cancer.
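The jump in accuracy when a second guess is allowed reflects a common way of scoring classifiers: a prediction counts as correct if the true answer appears anywhere among the model's top k ranked guesses. The short Python sketch below shows how such a score can be computed. The function name `top_k_accuracy` and the four example cases are hypothetical and are not data or code from the study.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label is among the first k ranked guesses."""
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_predictions, true_labels)
    )
    return hits / len(true_labels)


# Made-up example: each inner list is a model's guesses, most likely first.
ranked_predictions = [
    ["lung", "breast"],
    ["liver", "lung"],
    ["breast", "ovarian"],
    ["colorectal", "lung"],
]
true_labels = ["lung", "lung", "ovarian", "breast"]

top1 = top_k_accuracy(ranked_predictions, true_labels, k=1)
top2 = top_k_accuracy(ranked_predictions, true_labels, k=2)
print(top1, top2)
```

Allowing two guesses can never lower the score, which is why the second figure is always at least as high as the first.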
"Our study shows that ARTEMIS can reveal genome-wide repeat landscapes that reflect dramatic underlying changes in human cancers," said Akshaya Annapragada, an MD/PhD student at the Johns Hopkins University School of Medicine. "By illuminating the so-called dark genome, the work offers unique insights into the cancer genome and provides a proof-of-concept for the utility of genome-wide repeat landscapes as tissue and blood-based biomarkers for cancer detection, characterization, and monitoring."
The researchers now plan to refine the method and apply it in clinical trials. They hope it will soon improve early cancer detection, which could lead to better patient outcomes and potentially reduce the likelihood of cancer recurrence. "This is a totally new frontier," said Velculescu.
Sources: Johns Hopkins University School of Medicine, Science Translational Medicine