Our lab’s research focuses on the genetic factors and molecular mechanisms underlying age-related neurodegenerative diseases, in particular Alzheimer’s disease, related dementia, and Lewy body spectrum disorders. Our goal is to identify the precise causal and functional genetic variants underlying GWAS discoveries, and to understand their molecular mechanisms of action and the biological pathways through which they exert their pathogenic effects. To this end, we develop innovative in silico, in vitro, and in vivo approaches to study noncoding variants, specifically short structural variants, their corresponding trans factors, and their impact on the regulation of gene expression and splicing in the context of neurodegenerative phenotypes. Our research program has translational applications, informing clinical studies and supporting the development of novel genetic biomarkers and therapeutic targets.
Developing isogenic iPSC-derived model systems for studying the common and distinct genetic and molecular mechanisms underlying synucleinopathies
The SNCA gene has been implicated in the etiology of several synucleinopathies, including Parkinson’s disease (PD), dementia with Lewy bodies (DLB), the Lewy body variant of Alzheimer’s disease (LBV/AD), and multiple system atrophy (MSA). It has also shown the strongest significant association signal in multi-center genome-wide association studies (GWAS) of PD, DLB, and MSA. Furthermore, accumulating evidence suggests that elevated levels of wild-type α-synuclein protein, encoded by the SNCA gene, are causative in the pathogenesis of synucleinopathy spectrum disorders.
The most common synucleinopathies share a pathological hallmark, Lewy bodies (LBs) and Lewy-related neurites; however, each disease presents distinct characteristics. The cell types and brain regions containing LBs differ, particularly in early disease stages: whereas LBs in midbrain dopaminergic (mDA) neurons are the primary early characteristic of PD, early stages of DLB exhibit LBs primarily in the amygdala and cerebral cortex, as well as in basal forebrain cholinergic neurons.
We study the common and distinct molecular mechanisms underpinning the contribution of the SNCA locus to synucleinopathies. Modeling human neurodegenerative diseases, including synucleinopathies, with induced pluripotent stem cells (iPSCs) extends the use of cell culture systems in neurodegenerative disease research. The iPSC system has several advantages crucial to accomplishing our research goals. First, as an isogenic system, it allows us to examine the direct effects of specific genetic variants in a cell-specific manner on a common genetic background. Second, it enables us to perform comprehensive assessments of molecular phenotypes and cellular functions using living cells.
We have established isogenic iPSC-derived cholinergic and dopaminergic neurons, two neuronal types differentially involved in DLB and PD, respectively. The model system we have developed is suitable for investigating cell-specific regulatory mechanisms, involving both cis and trans factors, underlying synucleinopathies. Recently, using this model system, we identified a pathology-specific cis factor: a genetic variant in the SNCA 3’UTR that specifically affects the risk of DLB but not PD. As trans factors, we found miRNAs that are differentially expressed in iPSC-derived cholinergic versus dopaminergic neurons.
Decoding the genetics of late-onset Alzheimer’s disease: From GWAS to functionally regulatory variants
In the post-genome-wide association study (GWAS) era, we are shifting gears toward translating genetic disease loci into molecular mechanisms of pathogenesis. Large multi-center GWAS have found associations between >20 genomic loci and late-onset Alzheimer’s disease (LOAD), and candidate genes were inferred by their proximity to the associated SNP. However, the precise target genes within the LOAD-associated genomic regions and the causal variants affecting LOAD risk have yet to be uncovered. Identifying alterations in neuron-, astrocyte-, and microglia-specific gene expression profiles between normal and LOAD-pathological stages will advance the identification of the target genes within the LOAD-associated loci.
We developed a method to quantify cell-type-specific (neuronal and glial) gene expression levels. We use archived frozen human brain tissue to prepare slides for rapid immunostaining and isolate single neurons, astrocytes, or microglia by laser capture microdissection (LCM). Following RNA extraction, we determine gene expression digitally using the nCounter Single Cell gene expression assay (NanoString). With this method, we have been analyzing the expression profiles of genes within LOAD-risk regions in neurons, astrocytes, and microglia isolated from brain tissues of severe-LOAD, mild-LOAD, mild cognitive impairment (MCI), and normal control donors. We then evaluated the association between expression changes and LOAD pathological stages. In a pilot study, we found differential expression of several critical genes, some of which showed expression changes in early stages of the disease course, suggesting that the expression regulation of these genes may play a causative role in the etiology of LOAD.
In collaboration with the Crawford lab, we are now investigating chromatin accessibility profiles in LOAD-affected brain tissues and will use these data to map LOAD-specific regulatory elements. The heterogeneity of brain tissue makes it difficult to assess molecular characteristics of individual cell types. Frozen tissue presents additional challenges, as intact cell bodies are more difficult to isolate than intact nuclei. Our lab has employed the fluorescence-activated nuclei sorting (FANS) method to extract, purify, and fluorescently label nuclei from frozen human postmortem brains. The sorted neuronal (NeuN+) and non-neuronal (NeuN-) nuclei are then subjected to ATAC-sequencing to characterize neuron-specific chromatin accessibility within targeted LOAD-risk regions at the different LOAD pathological stages.
Structural variants and neurodegenerative diseases in aging: regulatory and causality consequences
Short structural variants (SSVs), i.e., variants other than SNPs, such as simple sequence repeats/short tandem repeats (SSRs/STRs), homopolymer stretches, and indels, were not included in disease GWAS or in genome-wide expression quantitative trait locus (eQTL) studies. Recently, support has been growing for the idea that SSVs may be involved in many complex diseases and may contribute significantly to variation in human gene expression. This class of variants may affect phenotypes by altering the regulation of gene transcription, splicing, and translation, and it is by these mechanisms that SSVs may play a role in the etiology of human diseases, including neurodegenerative diseases in aging. However, the roles of noncoding SSVs in complex human diseases, including LOAD, PD, and related disorders, and specifically the mechanisms whereby SSVs regulate gene expression and exert their pathogenic effects, have yet to be discovered.
In collaboration with Robert Saul, PhD, and Michael Lutz, PhD, we have developed a bioinformatics tool, the SSV evaluation system, to evaluate and prioritize noncoding SSVs by their likelihood of having a functional impact. For each SSV, an impact score is calculated as a weighted sum of terms that measure: (1) likelihood to have multiple polymorphic alleles, (2) synergy of consecutive variants, (3) gene structure context (intronic, exonic, 5’ and 3’ UTRs, promoter), (4) association with relevant trait/s (GWAS signals), (5) likelihood to affect gene regulation, presumably via genetic-epigenetic interactions, (6) propensity for clusters of adjacent SSVs to contribute to a biologically important haplotype, and (7) evolutionary conservation. For a defined genomic region or gene/s, the distribution of scores identifies SSVs with extreme values, which have a higher likelihood of being polymorphic and regulatory/functional. This bioinformatics system allows us to establish a short list of prioritized noncoding SSVs that are more likely to be causal and are enriched for functional properties, such as enhancer elements.
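The weighted-sum scoring described above can be illustrated with a minimal Python sketch. It is not the SSV evaluation system itself: the field names, the assumption that each term is pre-scaled to the range 0 to 1, and the equal weights are all hypothetical placeholders chosen for illustration, since the actual terms and weights are not specified here.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class SSVFeatures:
    """Hypothetical per-variant terms, each assumed to be scaled to [0, 1]."""
    polymorphism: float          # (1) likelihood of multiple polymorphic alleles
    synergy: float               # (2) synergy of consecutive variants
    gene_context: float          # (3) gene structure context
    trait_association: float     # (4) association with relevant traits (GWAS)
    regulatory_potential: float  # (5) likelihood to affect gene regulation
    haplotype_cluster: float     # (6) contribution to an important haplotype
    conservation: float          # (7) evolutionary conservation

# Assumed equal weights; the real system's weights are not given in the text.
WEIGHTS = {f.name: 1.0 for f in fields(SSVFeatures)}

def impact_score(features, weights=WEIGHTS):
    """Return the weighted sum of all terms for one SSV."""
    return sum(weights[f.name] * getattr(features, f.name)
               for f in fields(features))

def prioritize(variants, top_n, weights=WEIGHTS):
    """Rank SSVs by impact score (highest first) and keep the top_n."""
    scored = [(name, impact_score(feat, weights))
              for name, feat in variants.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]
```

Calling `prioritize` on a dictionary that maps variant identifiers to `SSVFeatures` returns the short list of highest-scoring SSVs for a region. In practice the cutoff would more plausibly be chosen from the distribution of scores across the region than from a fixed `top_n`.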
We are launching a new initiative to accurately determine SSVs and haplotypes using single-molecule, real-time (SMRT) sequencing and a PCR-free system for target enrichment. This approach will be used in case-control association tests to identify SSVs/haplotypes associated with neurodegenerative diseases of the aging brain. The bioinformatics analyses focus our SMRT-sequencing efforts on a smaller list of target regions containing SSVs that are predicted to have a high biological impact and may be causal variants. These target regions are enriched with a CRISPR/Cas9-based system and then sequenced, with phasing, using SMRT-sequencing technology (PacBio). This combined technology is well suited to accurately determining SSVs and haplotypes, as it provides long reads and does not require DNA amplification.
Polymorphic repetitive sequences in the etiology of neurodegenerative disorders
Our overarching goal is to uncover the molecular mechanisms underlying the reported associations of genomic regions with the risk of developing neurodegenerative diseases in aging. The central objective of this project is to gain a molecular, mechanistic understanding of how noncoding simple sequence repeats/short tandem repeats (SSRs/STRs), including homopolymer stretches, affect transcription regulation of key genes in the context of their association with diseases of the aging brain. This study is a collaborative effort with Raluca Gordân, PhD, and David Lukatsky, PhD, to test the hypothesis that differential non-consensus binding of TFs to the variable alleles at repeat sequence sites is responsible, at least in part, for differential expression of the cis-regulated genes, which can contribute to disease pathogenesis.
The broader contribution of the SNCA gene to the wide spectrum of Lewy body disorders
A neuropathological hallmark of a group of neurodegenerative diseases, known as human “synucleinopathies,” is the presence of intracellular protein aggregates, Lewy bodies (LBs), upon postmortem brain examination. The alpha-synuclein protein is a major component of LBs. Parkinson’s disease (PD) is the prototype human synucleinopathy and has been studied extensively. Genome-wide association studies (GWAS) have implicated the alpha-synuclein gene (SNCA) in PD, DLB, and MSA. Several studies in cell cultures and animal models reported that overexpression of wild-type SNCA can be toxic and may lead to cell death. Moreover, we documented elevated SNCA expression in patients that correlated with Lewy body presentation. One of the leading projects in our lab aims to uncover the molecular mechanisms regulating SNCA expression, and any genetic variability that impinges on this regulation, to advance the understanding of the pathogenesis of LB-related diseases. We conduct experiments using in silico, in vitro, and in vivo approaches to decipher the genetic elements controlling SNCA expression at the levels of transcription and RNA metabolism, including splicing, and to identify the precise functional and causal genetic variants that contribute to changes in the expression regulation of SNCA and to the risk of developing LB pathologies in general or a particular disease in the spectrum. Our studies of the novel haplotype in a CT-rich region within SNCA intron 4, the Rep-1 complex repeat ~10kb upstream of SNCA, and a poly-T in the 3’UTR exemplify our strategy and discoveries.
The role of the TOMM40-APOE genomic region in late-onset Alzheimer’s disease
APOE e4 is the strongest and most replicated genetic risk factor for late-onset Alzheimer’s disease (LOAD). APOE is located on chromosome 19 (19q13.32) in the tight gene cluster TOMM40-APOE-APOC1-APOC4-APOC2, which exhibits strong linkage disequilibrium (LD). Genome-wide association studies (GWAS) reported that the strongest association signal (by a wide margin) was also in the APOE LD region. This top association signal was attributed to the APOE e4 haplotype; however, it has been acknowledged that other genetic factors within this LD block may also explain the strong genome-wide significant signal and contribute, in part, to the disease risk attributed to this genomic region.
This work has been carried out in collaboration with Allen Roses, MD, and his team, as well as Michael Lutz, PhD, and W. Kirby Gottschalk, PhD.
We investigated the TOMM40-APOE genomic region associated with the risk and age of onset of LOAD to determine the functional consequences of a highly polymorphic, intronic polyT within this region (rs10524523) discovered by Roses et al. The TOMM40-polyT was associated with LOAD age of onset and other disease-related endophenotypes, and we have identified an association with cognitive performance in normal aging. Specifically, we have been studying the regional regulatory effect of the TOMM40-polyT site on gene expression. In these investigations, we are using multiple model systems, including human brain tissues, cell-based reporter systems, mESCs, and humanized mouse models. We have found that the polyT acts as a regional transcriptional regulator of the TOMM40 and APOE genes, suggesting that the TOMM40-polyT locus may contribute to LOAD susceptibility by modulating TOMM40 and/or APOE mRNA expression levels. We have also manipulated the transcriptional effects on TOMM40 and APOE-C1 genes using two approaches: knockdown of PPARgamma with shRNA, and treatment with pharmacological agents that are known PPARgamma agonists, particularly low doses of pioglitazone (Pio) and rosiglitazone (Rosi).