So-called junk DNA was given that unfortunate nickname because its function was so mysterious. These vast regions of the genome do not code for protein and are made up of highly repetitive sequences. But more recent work has revealed that there are sequences of junk DNA that can influence how protein-coding genes are expressed. Some of these sequences are called short tandem repeats (STRs); they comprise about five percent of the human genome. Now, scientists have revealed more about how STRs affect gene expression. The findings have been reported in Science.
The presence or absence of STRs has been associated with changes in gene activity, so scientists have known that they are not really junk, noted study author Polly Fordyce, PhD, an associate professor of bioengineering and of genetics at Stanford University.
In this work, the researchers wanted to know more about the interaction between STRs and transcription factors, which bind to DNA and can regulate how protein-coding genes are expressed. Previous work by other researchers has outlined how these transcription factors function and which sequences of DNA, also called motifs, they preferentially bind, explained Fordyce. However, we still have not fully explained when and where transcription factors bind to non-coding regions of DNA to affect gene expression.
"To solve the puzzle of why transcription factors go to some places in the genome and not to others, we needed to look beyond the highly preferred motifs," Fordyce said. "In this study, we're showing that the STR sequence around the motif can have a really big effect on transcription factor binding, providing clues as to what these repeated sequences might be doing."
The investigators used special assays that focused solely on DNA molecules and transcription factors. Thousands of experiments were run to compare the binding strength between transcription factors and thousands of DNA sequences. Some of these sequences had preferred motifs or were surrounded by different STRs, while others lacked these features. The researchers wanted to know how changes in the DNA sequences influenced binding between DNA and transcription factors.
"We saw a surprisingly large effect. Varying the STR sequence around a motif can have up to a seventy-fold impact on the binding," said Fordyce.
Hundreds of transcription factors with mutations in their DNA binding domains were also created and tested; mutant transcription factors could not always recognize motifs or STRs. The study suggested that transcription factors interact with repetitive portions of the genetic code directly, and use their DNA binding domains to bind to these regions.
With results from over 6,000 experiments, the scientists identified some rules that seem to govern how transcription factors bind.
"We set out to study [STRs]. But the models we developed apply broadly to the entire regulatory landscape," noted lead study author and former Fordyce lab technician Connor Horton, now a graduate student at the University of California, Berkeley. "It helps us better understand how transcription factors bind to regulatory DNA, even when [STRs] aren't involved."
This research can also teach us more about various disorders. "It's been known for some time that [STRs] are associated with increased or decreased risk of certain diseases," Horton said. Changes in an individual's STRs may alter how transcription factors bind, which changes gene expression and could be connected to disease.
Sources: Stanford University, Science