PgmNr 120: Multi-tissue analysis reveals short tandem repeats as ubiquitous regulators of gene expression and complex traits.
Authors: M. Gymrek; S. Feupe Fotsing; C. Wang; S. Saini; R. Yanicky; S. Shleizer-Burko; A. Goren
Affiliation: Univ California San Diego, La Jolla, California.
Short tandem repeats (STRs) have been implicated in a variety of complex traits in humans. However, genome-wide studies of the effects of STRs on gene expression have so far had limited power to detect associations and elucidate the underlying biological mechanisms. Here, we leverage whole genome sequencing and expression data for 17 tissues from the Genotype-Tissue Expression Project (GTEx) to identify STRs whose repeat lengths are associated with expression of nearby genes (eSTRs). Fine-mapping analysis reveals more than 3,000 high-confidence eSTRs, which are enriched in known or predicted regulatory regions. We show that eSTRs may act through a variety of mechanisms, including controlling nucleosome positioning (homopolymers), altering affinity of transcription factor binding sites (dinucleotides), and modulating DNA or RNA secondary structure (GC-rich promoter repeats). We further apply co-localization analysis to identify hundreds of eSTRs that potentially drive published GWAS signals and implicate specific eSTRs in height, schizophrenia, and blood traits. For example, we identified a dinucleotide STR near the 3’ end of the gene RFT1 as a potential causal variant for height. To validate this finding, we imputed the STR into an independent cohort (eMERGE) and found a positive association between repeat number and height (p=0.0032). We additionally performed a dual reporter assay to test the effect of this STR in vitro and recapitulated the expected positive association between repeat number and expression (p=0.013). Overall, our results demonstrate that eSTRs potentially contribute to a range of human phenotypes. We expect that our comprehensive eSTR catalog will serve as a valuable resource for future studies of complex traits. Complete eSTR summary statistics are publicly available and can be browsed interactively at webstr.gymreklab.com.
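To make the notion of an eSTR concrete, the following is a minimal sketch of one way to test whether repeat length is associated with expression of a nearby gene: an ordinary least-squares regression of expression on repeat length. The abstract does not state the statistical model used, so the choice of simple linear regression, the function name `estr_association`, the encoding of genotypes as the mean repeat count of the two alleles, and the toy values are all assumptions made for illustration. A real analysis would also adjust for covariates such as population structure.

```python
import math


def estr_association(repeat_lengths, expression):
    """Regress expression on STR repeat length with ordinary least squares.

    repeat_lengths: per-sample repeat dosage, here assumed to be the mean
        repeat count of the two alleles (hypothetical encoding).
    expression: per-sample normalized expression of a nearby gene.
    Returns (slope, intercept, t_statistic) for the repeat-length effect.
    """
    n = len(repeat_lengths)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_x = sum(repeat_lengths) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in repeat_lengths)
    if sxx == 0:
        raise ValueError("repeat length does not vary across samples")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(repeat_lengths, expression))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and the standard error of the slope.
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(repeat_lengths, expression))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        # Perfect fit: the t-statistic is unbounded unless the slope is zero.
        t_stat = math.copysign(math.inf, slope) if slope != 0 else 0.0
    else:
        t_stat = slope / se
    return slope, intercept, t_stat


# Toy data (invented for illustration only, not from the study).
toy_lengths = [10, 11, 11, 12, 13, 14]
toy_expression = [0.1, 0.3, 0.2, 0.5, 0.6, 0.9]
slope, intercept, t_stat = estr_association(toy_lengths, toy_expression)
```

A positive slope corresponds to the direction of effect reported for the RFT1 example, where longer repeats go with higher expression. Converting the t-statistic to a p-value requires a Student's t distribution with n - 2 degrees of freedom, which is omitted here to keep the sketch free of third-party dependencies.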