Functional genomic analyses uncover APOE-mediated regulation of brain and cerebrospinal fluid beta-amyloid levels in Parkinson disease

Alpha-synuclein is the main protein component of Lewy bodies, the pathological hallmark of Parkinson’s disease (PD). However, genetic modifiers of cerebrospinal fluid (CSF) alpha-synuclein levels remain unknown. The use of CSF levels of amyloid beta1–42, total tau, and phosphorylated tau181 as quantitative traits in genetic studies has provided novel insights into Alzheimer’s disease pathophysiology. A systematic study of the genomic architecture of CSF biomarkers in Parkinson’s disease has not yet been conducted. Here, genome-wide association studies of CSF biomarker levels in a cohort of individuals with Parkinson’s disease and controls (N = 1960) were performed. PD cases exhibited significantly lower CSF biomarker levels compared to controls. A SNP, proxy for APOE ε4, was associated with CSF amyloid beta1–42 levels (effect = −0.5, p = 9.2 × 10⁻¹⁹). No genome-wide significant loci associated with CSF alpha-synuclein, total tau, or phosphorylated tau181 levels were identified in PD cohorts. Polygenic risk scores constructed using the latest Parkinson’s disease risk meta-analysis were associated with Parkinson’s disease status (p = 0.035) and the genomic architecture of CSF amyloid beta1–42 (R² = 2.29%; p = 2.5 × 10⁻¹¹). Individuals with higher polygenic risk scores for PD risk presented with lower CSF amyloid beta1–42 levels (p = 7.3 × 10⁻⁴). Two-sample Mendelian randomization revealed that CSF amyloid beta1–42 plays a role in Parkinson’s disease (p = 1.4 × 10⁻⁵) and age at onset (p = 7.6 × 10⁻⁶), an effect mainly mediated by variants in the APOE locus. In a subset of PD samples, the APOE ε4 allele was associated with significantly lower levels of CSF amyloid beta1–42 (p = 3.8 × 10⁻⁶), higher mean cortical binding potentials (p = 5.8 × 10⁻⁸), and higher Braak amyloid beta score (p = 4.4 × 10⁻⁴). Together, these results from high-throughput and hypothesis-free approaches converge on a genetic link between Parkinson’s disease, CSF amyloid beta1–42, and APOE.


Introduction
Ibanez et al. acta neuropathol commun (2020) 8:196
Parkinson's disease (PD) is a neurodegenerative disease characterized by rest tremor, rigidity, bradykinesia, and postural instability [57]. It is the most common neurodegenerative movement disorder, affecting more than six million people worldwide, with its prevalence projected to double in the next several decades [29]. Aggregated and phosphorylated alpha-synuclein (α-Syn) is the main protein component of Lewy bodies (LB) and neurites, the pathological hallmark of Lewy body diseases. The gene dosage effect of the SNCA gene, which encodes α-Syn, correlates with cerebrospinal fluid (CSF) α-Syn levels and a more severe PD phenotype [30,55]. Common variants in the SNCA promoter are among the top genome-wide association studies (GWAS) signals for PD [54], suggesting that genetic control of CSF α-Syn level plays a role in PD phenotype variability. A modest but significant decrease (~10% to 15%) in CSF α-Syn levels has been reported in PD cases compared to controls [33] and is correlated with disease progression [1,13,35]. CSF α-Syn is not currently used as a clinical biomarker [35,53], but is a proxy for pathological brain α-Syn accumulation [64]. Therefore, identifying genetic modifiers of CSF α-Syn levels could provide insight into PD pathogenesis. To date, genetic modifiers of CSF α-Syn remain unknown.
The accumulation of α-Syn in specific brain regions defines different subtypes of Lewy body diseases (LBD). However, pure α-Syn pathology is only found in 45% (brainstem), 32% (limbic) and 19% (neocortical) of LBD. The concomitant presence of amyloid beta (Aβ), tau, and TDP-43 is a common finding in LBD. Aβ and tau pathology are present in up to 80% and 53% of neocortical LBD cases, respectively [58]. LBD patients with concomitant Alzheimer's disease (AD) pathology exhibit a faster cognitive decline [44]. CSF levels of amyloid beta 1–42 (Aβ42), total tau (t-tau), and phosphorylated tau 181 (p-tau181) are commonly used as proxies of Aβ and tau pathology in the brain [8]. A correlation between lower Aβ42 CSF levels and higher Braak stage scores of AD neuropathology was found in neuropathologically confirmed LBD cases [8]. In cross-sectional studies, PD cases exhibit lower CSF levels of Aβ42 compared to age- and gender-matched healthy controls [14,52]. CSF levels of Aβ42 and t-tau are also associated with the progression of cognitive decline [52]. Decreased CSF Aβ42 levels predict the development of dementia in PD patients [47,63]. These results suggest that dementia-associated CSF biomarker profile signatures could be informative of brain pathology in PD patients. GWAS using CSF Aβ42, t-tau, and p-tau181 levels as quantitative traits have identified genes involved in AD pathogenesis [24]. However, a systematic study of the role of genetic modifiers of dementia CSF biomarkers in PD has not yet been conducted.
This study aimed to uncover genetic modifiers of α-Syn, Aβ42, t-tau, and p-tau181 CSF levels in PD patients by performing a large (N = 1960) GWAS meta-analysis of CSF biomarkers in PD cohorts. Polygenic risk scores (PRS) and Mendelian randomization (MR) analyses were integrated with the latest PD risk meta-analysis (META-PD) and CSF biomarker summary statistics to examine the causal relationship between CSF biomarkers and PD risk. This is the first comprehensive analysis of CSF biomarkers using GWAS, PRS, and MR in PD.

Study design
The goal of this study was to identify common genetic variants and genes associated with CSF α-Syn, Aβ42, t-tau, and p-tau181 in PD. A three-stage GWAS was used: discovery, replication, and meta-analysis. The discovery phase included 729 individuals from the Protein and Imaging Biomarkers in Parkinson's disease study (PIB-PD) at the Washington University Movement Disorder Center [9] (n = 103) and the Knight ADRC [24] (n = 626). The replication phase included 1231 independent CSF samples obtained from PD cases and healthy elderly individuals from three additional studies [the Parkinson's Progression Markers Initiative (PPMI), Alzheimer Disease Neuroimaging Initiative (ADNI), and Spain]. Meta-analyses were performed using a fixed-effects model. Genetic loci that passed the multiple test correction for GWAS (p < 5.0 × 10⁻⁸) were functionally annotated using bioinformatics tools to identify variants and genes driving the GWAS signal. PRS were used to test the correlation between CSF biomarkers and PD genetic architecture. Instrumental variables were selected from the summary statistics of CSF biomarkers, and MR methods were applied to test causality.

Biomarker measurements
α-Syn in CSF was measured in 107 samples from the WUSTL cohort [9] and the entire PPMI cohort, using a commercial ELISA kit (Covance, Dedham, MA) [45]. The additional samples (N = 622) from WUSTL were quantified using the SOMAScan platform (see below). Aβ42, t-tau, and p-tau181 were quantified using the INNOTEST assay (WUSTL) and xMAP-Luminex with INNOBIA AlzBio3 (PPMI). The immunoassay platform from Roche Elecsys cobas e 601 was used in the ADNI cohort to quantify all four biomarkers. ELISA assays from Euroimmun (Germany) were used in the Spanish cohort to measure the CSF levels of α-Syn, Aβ42, t-tau, and p-tau181. The α-Syn levels were normalized by log10 transformation. Aβ42, t-tau, and p-tau181 values were normalized and standardized by the Z score transformation. Individuals with biomarker levels outside three standard deviations of the mean were removed from the analysis (Table 1).
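The normalization steps described above can be illustrated with a minimal Python sketch. The function names are hypothetical, the sample standard deviation is an assumption, and the text does not state in which order transformation and outlier removal were applied, so each step is shown as an independent helper rather than as the authors' pipeline.

```python
import math
import statistics


def log10_transform(values):
    """Log10-transform strictly positive biomarker concentrations."""
    return [math.log10(v) for v in values]


def z_scores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample (n - 1) SD
    return [(v - mean) / sd for v in values]


def drop_outliers(values, n_sd=3.0):
    """Keep only values within n_sd standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]
```

With a 3 SD cutoff, a single extreme value is only removed when the sample is large enough for it to exceed three sample standard deviations, which is one reason the order of the steps matters in practice.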

Amyloid beta imaging
[11C]-Pittsburgh Compound B (PIB) acquisition and analysis were performed according to published methods [34]. Briefly, 10-15 mCi of the radiotracer was injected via an antecubital vein, and a 60-min three-dimensional dynamic PET scan was collected in 53 frames. Emission data were corrected for scattering, randoms, attenuation, and dead time. Image reconstruction produced images with a final resolution of 6 mm full-width half-maximum at the center of the field of view. Frame alignment was corrected for head motion and co-registered to each person's T1-weighted magnetization-prepared rapid gradient echo magnetic resonance scan [61]. For quantitative analyses, three-dimensional regions of interest (prefrontal cortex, gyrus rectus, lateral temporal cortex, precuneus, occipital lobe, caudate nucleus, brainstem, and cerebellum) were created by a blinded observer for each subject based on the individual's MRI scans, with boundaries defined as previously described [50]. Binding potentials (BPND) were calculated using Logan graphical analysis, with the cerebellum as the reference tissue input function [49,50]. Mean cortical binding potentials (MCBP) were calculated for each subject as the average of all cortical regions except the occipital lobe.

Neuropathologic analysis
The neuropathological analysis was done at WUSTL, as previously reported [47]. Briefly, brains were fixed in 10% neutral buffered formalin for 2 weeks. Paraffin-embedded sections were cut at 6 μm. Blocks were taken from the frontal, temporal, parietal, and occipital lobes (thalamus, striatum, including the nucleus basalis of Meynert, amygdala, hippocampus, midbrain, pons, medulla oblongata) and the cervical spinal cord. Histologic stains included hematoxylin-eosin and a modified Bielschowsky silver impregnation. The Alzheimer's disease pathologic changes were rated using an amyloid plaque stage (range, 0 to A-C) [7] and diffuse and neuritic plaques were also assessed. Cases were classified according to the neuropathologic criteria of Khachaturian [46], the Consortium to Establish a Registry

Genotyping
All cohorts, except PPMI, were genotyped using the Global Screening Array (GSA) Illumina platform. Genotyping quality control and imputation were performed using SHAPEIT [23] and IMPUTE2 [38] with the 1000 Genomes as a reference panel. Single nucleotide polymorphisms (SNPs) with a call rate lower than 98% and autosomal SNPs that were not in Hardy-Weinberg equilibrium (p < 1.0 × 10⁻⁶) were excluded from downstream analyses. The X chromosome SNPs were used to determine sex based on heterozygosity rates. Samples in which the genetically inferred sex was discordant with the reported sex were removed. Whole-genome sequence data from the PPMI cohort were merged with imputed genotype data; only variants present in both files were included in further analyses. Pairwise genome-wide estimates of proportion identity-by-descent were used to test for the presence of unexpected duplicates and cryptically related samples (PI_HAT > 0.50). Unexpected duplicates were removed; for cryptically related samples, the sample with the higher genotyping rate in the merged file was kept. Finally, principal components were calculated using HapMap as an anchor. Only samples of European descent with an overall call rate higher than 95%, and variants with minor allele frequency (MAF) greater than 5%, were included in the analyses.
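As an illustration of the variant-level filters listed above (call rate, Hardy-Weinberg equilibrium, MAF), the following Python sketch evaluates a single variant. It is not the authors' pipeline: the function name and data layout are hypothetical, and the Hardy-Weinberg p-value uses a one-degree-of-freedom chi-square approximation, whereas genotype QC tools typically use an exact test.

```python
import math


def variant_qc(genotypes, min_call_rate=0.98, min_maf=0.05, hwe_alpha=1e-6):
    """Summarize one variant and decide whether it passes QC.

    genotypes: non-empty list of alternate-allele counts (0, 1, 2),
    with None marking a missing call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes)
    if n == 0:
        return {"call_rate": 0.0, "maf": 0.0, "hwe_p": 1.0, "passes": False}

    n_ref, n_het, n_alt = called.count(0), called.count(1), called.count(2)
    p_alt = (2 * n_alt + n_het) / (2 * n)
    q_ref = 1.0 - p_alt
    maf = min(p_alt, q_ref)

    if maf == 0.0:
        hwe_p = 1.0  # monomorphic variant: no HWE test possible
    else:
        expected = [n * q_ref ** 2, 2 * n * p_alt * q_ref, n * p_alt ** 2]
        observed = [n_ref, n_het, n_alt]
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Upper tail of a chi-square distribution with 1 degree of freedom.
        hwe_p = math.erfc(math.sqrt(chi2 / 2.0))

    passes = call_rate >= min_call_rate and maf > min_maf and hwe_p >= hwe_alpha
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passes": passes}
```

The default thresholds mirror the values quoted in the text (98% call rate, MAF greater than 5%, HWE p < 1.0 × 10⁻⁶ for exclusion).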

Single variant analysis
The three-stage single variant analysis was performed due to differences in time and platform for biomarker quantification. PLINK1.9 [16,56] was used to perform the analysis of each cohort independently. A linear model of the normalized and standardized CSF levels, adjusted for sex, age, and the first two principal components, was used. Disease status was not included in the model [25]. Then, the results for each protein were meta-analyzed using METAL [69]. For the α-Syn analyses, the WUSTL cohort was divided into two subsets based on the quantification method (ELISA or SOMAscan).
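To make the meta-analysis step concrete, here is a minimal sketch of a fixed-effects, inverse-variance-weighted combination of per-cohort effect estimates for one SNP. The function name is hypothetical, and the sketch assumes a standard-error-based weighting scheme; the text states only that a fixed-effects model was used, not which METAL scheme was chosen.

```python
import math


def fixed_effects_meta(betas, ses):
    """Combine per-cohort effects with inverse-variance weights.

    betas: per-cohort effect estimates for the same allele.
    ses:   matching standard errors (all > 0).
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value
```

Because weights scale with the inverse of the variance, larger or more precisely measured cohorts dominate the pooled estimate, and effects with opposite signs across cohorts tend to cancel.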

Analysis of variance
The genome-wide complex traits analysis (GCTA) software [71] was used to calculate the amount of variance explained by the APOE locus. GCTA estimates the phenotypic variance explained by genetic variants for a complex trait by fitting the effect of these SNPs as random effects in a linear mixed model.

Multi-tissue analysis
The levels of α-Syn were measured in CSF, plasma, and brain (parietal cortex) using an aptamer-based approach (SOMAScan platform) [70]. After stringent quality control, CSF (n = 835), plasma (n = 529), and brain (n = 380) samples were included in the downstream analyses (Additional file 2: Table S2). The protein level was log10-transformed to approximate the normal distribution and used as the phenotype for the subsequent GWAS. The single variant analysis was performed in each tissue independently using PLINK1.9 [56]. A multi-tissue analysis using the multi-trait analysis of GWAS (MTAG) [66] was applied to increase the power of detecting non-tissue-specific protein quantitative trait loci for α-Syn. MTAG calculates the trait-specific effect estimate for each tissue separately and then performs a meta-analysis while accounting for sample overlap. Measurements of Aβ42, t-tau, and p-tau181 were not available in different tissues.

Polygenic risk score
PRS is constructed by summing all trait-associated alleles in a target sample (META-PD and CSF biomarkers separately), weighted by the effect size of each allele in a base dataset, using different p-value thresholds. SNPs in linkage disequilibrium (LD) are grouped together to avoid giving extra weight to a single marker. The optimal threshold is considered the one that explains the maximum variance in the target sample. The association was tested using the default parameters and nine p-value cutoffs. The PRSice2 software [17] was used to calculate the PRS. Longitudinal measures of CSF α-Syn, Aβ42, t-tau, and p-tau181 were available for the PPMI cohort. A simplified PRS (detailed below) was used to test if the genetic architecture of PD was predictive of biomarker level progression. The PD PRS using sentinel SNPs from the META-PD [54] was modeled using the method previously described [19,41,42,68]. Briefly, only genetic variants corresponding to the top hit on each GWAS locus (also known as sentinel SNP) available in the dataset with a minimum call rate of 85% were included in the PRS. If the sentinel SNP was not available, a proxy with R² > 0.90 was used. The weight of each variant was calculated using the binary logarithm transformation of the reported odds ratios. The final PRS is the sum of the weighted values for the alternate allele of all the sentinel SNPs.
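The simplified sentinel-SNP PRS described above reduces to a weighted sum, which the following Python sketch illustrates. The function name, the dictionary layout, and the SNP identifiers in the usage example are hypothetical; the binary-logarithm weighting follows the description in the text.

```python
import math


def simplified_prs(dosages, odds_ratios):
    """Weighted sum of alternate-allele counts over sentinel SNPs.

    dosages:     dict mapping SNP id -> alternate-allele count (0 to 2)
                 for one individual.
    odds_ratios: dict mapping SNP id -> reported odds ratio for the
                 alternate allele.
    SNPs without a reported odds ratio are skipped.
    """
    return sum(
        math.log2(odds_ratios[snp]) * count
        for snp, count in dosages.items()
        if snp in odds_ratios
    )


# Usage with made-up identifiers and odds ratios (illustrative only).
example_score = simplified_prs(
    {"snp_a": 2, "snp_b": 1},
    {"snp_a": 2.0, "snp_b": 0.5},
)
```

Risk-increasing alleles (odds ratio above 1) contribute positive weights and protective alleles contribute negative weights, so the score is signed.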

Mendelian randomization
MR requires that the genetic instruments are associated with the modifiable exposure of interest (GWAS of CSF biomarkers), and that any association between the instruments and the outcome (PD risk) is mediated by the exposure [11]. A two-sample MR was used to estimate causal effects using the Wald ratio for single variants along with an inverse-variance-weighted (IVW) fixed-effects meta-analysis for an overall estimate [36]. The IVW estimate is the inverse-variance-weighted mean of ratio estimates from two or more instruments. Two-sample MR provides an estimate of the causal effect of an exposure on an outcome, using independent samples to obtain the gene-exposure and gene-outcome associations, provided three key assumptions hold: (i) genetic variants are robustly associated with the exposure of interest (i.e. replicate in independent samples), (ii) genetic variants are not associated with potential confounders of the association between the exposure and the outcome and (iii) there are no effects of the genetic variants on the outcome, independent of the exposure (i.e. no horizontal pleiotropy). To account for potential violations of the assumptions underlying the IVW analysis, a sensitivity analysis using MR-Egger regression and the weighted median estimator was performed [36]. MR Egger regression consists of a weighted linear regression of SNP META-PD against SNP biomarker effect estimates.
Assuming that horizontal pleiotropic effects and SNP exposure associations are uncorrelated (i.e., the instrument strength independent of direct effects assumption), MR Egger regression provides a valid effect estimate even if all SNPs are invalid instruments. Moreover, the MR Egger intercept can be interpreted as a test of overall unbalanced horizontal pleiotropy because one would expect a null y-intercept (i.e., the mean value of the SNP META-PD associations when the SNP biomarker association is zero) if there are no horizontal pleiotropic effects. Robust regression, which downplays the contribution to the causal estimate of instrumental variables with heterogeneous ratio estimates, was also performed [10,12]. Heterogeneity (i.e., instrument strength) was tested using the I² statistic. The I² statistic, rather than the F statistic, is a better indicator of instrument strength for the two-sample summary data approach [6]. The R package "MendelianRandomization" [72] (version 0.4.1) was used for the MR analyses.
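The estimators described in this section can be sketched from summary statistics alone. The Python below is an illustrative, simplified version (point estimates only, first-order weights 1/se_outcome², no standard errors, no robust regression) and is not the implementation in the R package used by the authors; all function names are hypothetical.

```python
def wald_ratio(beta_exposure, beta_outcome):
    """Single-variant causal estimate: outcome effect / exposure effect."""
    return beta_outcome / beta_exposure


def ivw_estimate(bx, by, se_y):
    """Inverse-variance-weighted mean of the Wald ratios.

    Equivalent to a weighted regression of by on bx through the origin,
    with weights 1 / se_y**2.
    """
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den


def mr_egger(bx, by, se_y):
    """Weighted regression of by on bx with a free intercept.

    Variants are first oriented so that the exposure effect is positive.
    Returns (intercept, slope): the intercept reflects average unbalanced
    horizontal pleiotropy, the slope is the causal estimate.
    """
    oriented = [(-x, -y) if x < 0 else (x, y) for x, y in zip(bx, by)]
    xs = [x for x, _ in oriented]
    ys = [y for _, y in oriented]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope
```

When the outcome effects are exactly proportional to the exposure effects, IVW and MR-Egger agree and the Egger intercept is zero; a non-zero intercept shifts the IVW estimate but not the Egger slope.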
The latest and largest meta-analysis for PD genetic risk was used to perform the MR analyses [54]. Summary statistics from the largest GWAS of CSF Aβ42, t-tau, and p-tau181 were also used [25]. Deming et al. performed a one-stage GWAS for 3146 NHW individuals across nine independent studies [25]. None of these cohorts included PD affected individuals for each biomarker (Aβ42, t-tau, and p-tau181). Finally, the summary statistics of the GWAS for α-Syn CSF levels generated in the current study were used. There was no overlap between CSF biomarker datasets and PD risk datasets. Instrumental variables for each GWAS were obtained by clumping each set of GWAS summary statistics based on the LD structure of the exposure (CSF biomarker levels) and a significance threshold of 1.0 × 10⁻⁵ using PLINK1.9 [56]. Instrumental variables were restricted to those that are uncorrelated (in linkage equilibrium) by setting the --clump-r2 flag to 0.0 and the --clump-kb flag to 1000 (1 Mb).

No significant loci were identified for CSF α-Syn, t-tau or p-tau181 in Parkinson's disease cohorts
Within each cohort, a linear regression testing the additive genetic model of each SNP for association with CSF protein levels, using age, gender, and two principal component factors for population stratification as covariates, did not reveal any genome-wide significant loci associated with CSF α-Syn. Although several suggestive loci (p < 10⁻⁶ to 10⁻⁸) were identified in these analyses (Additional file 1: Fig. S1 and Additional file 2: Table S3), none of them passed the multiple test correction threshold when cohorts were combined in the meta-analysis (Fig. 2a and Additional file 2: Table S3).
Joint analyses of CSF α-Syn levels, stratified by PD cases (N = 700), PD cases and controls (N = 889), AD cases only (N = 386), AD cases and controls (N = 575) and controls only (N = 189), were also performed. None of these analyses revealed any genome-wide significant locus, suggesting that these sample sizes might be underpowered to uncover the genetic modifiers of CSF α-Syn.
For t-tau, individual cohort analyses revealed four genome-wide significant loci (Additional file 1: Fig. S3 and Additional file 2: Table S5). However, none of them remained significant in the meta-analyses (Fig. 2b and Additional file 2: Table S5). For p-tau181, individual cohort analyses revealed three genome-wide significant loci (Additional file 1: Fig. S4 and Additional file 2: Table S6). However, none achieved significance in the meta-analyses (Fig. 2c and Additional file 2: Table S6).

Genetic analyses of multi-tissue α-Syn levels
In a subgroup of samples, α-Syn levels were measured in plasma (N = 529), brain (N = 380), and CSF (N = 835) using the SOMAScan platform (Additional file 2: Table S2). Single variant analysis was performed in each tissue separately (Additional file 1: Fig. S2A to 2C). Multi-tissue analysis was performed using MTAG [66]. Although two suggestive loci were observed in chromosomes 3 and 13 (Additional file 1: Fig. S2D and Additional file 2: Table S4) within genomic regions enriched with long intergenic non-protein coding (LINC) genes (Additional file 1: Fig. S2E and F), no genome-wide significant locus was identified. These results suggest that the power boost of using MTAG is not enough to unveil the genetic architecture of α-Syn.

APOE locus is associated with Aβ42 CSF levels in Parkinson's disease cohorts
A proxy SNP for APOE ε4, rs769449, was associated with CSF levels of Aβ42 in the WUSTL (effect = −0.56, p = 4.15 × 10⁻¹⁹) and ADNI cohorts (effect = −0.73, p = 1.25 × 10⁻¹⁵). This association did not pass the genome-wide multiple test correction threshold in the PPMI cohort (effect = −0.43, p = 3.09 × 10⁻⁷) and was not significant in the Spanish cohort (Additional file 1: Fig. S5 and Additional file 2: Table S7). The APOE locus (effect = −0.57, p = 4.46 × 10⁻⁴³) and a locus in the HLA region (effect = 0.23, p = 2.88 × 10⁻⁸) remained significant in the meta-analysis (Fig. 2d-f). When the cohorts containing only PD cases and controls were analyzed jointly (WUSTL and PPMI; N = 700 cases and 189 controls), the APOE locus was GWAS significant (effect = −0.50, p = 9.25 × 10⁻¹⁹) but the HLA region was not (effect = 0.22, p = 3.58 × 10⁻⁴). In the combined analysis of all cohorts (N = 1960), the APOE locus accounted for 36.2% of the CSF Aβ42 levels variance (p = 2.35 × 10⁻³). Overall, these results revealed a strong and highly significant association between the APOE locus and lower CSF Aβ42 levels in PD cohorts.

[Figure caption, partially recovered] e, f Regional association plots of loci are shown for SNPs associated with CSF Aβ42 levels near HLA (e) and near the APOE locus (f). The SNPs labeled on each regional plot had the lowest p-value at each locus and are represented by a purple diamond. Each dot represents an SNP, and dot colors indicate linkage disequilibrium with the labeled SNP. Blue vertical lines show the recombination rate marked on the right-hand y-axis of each regional plot. Suggestive SNPs for α-Syn, t-tau, and p-tau181 can be found in Additional file 2: Tables S3 to S6.

Significant correlation of genomic architecture of Parkinson's disease risk and CSF Aβ42
PRS at different p-value thresholds were used to test if the genetic variants associated with dementia biomarkers were associated with the genomic architecture of PD. PRS calculated using the META-PD [54] were associated with PD status in the WUSTL cohort (N = 108; p = 0.035). The PPMI cohort was excluded from this analysis due to overlap with META-PD. No correlation was observed between the genetic architecture of PD and that of CSF α-Syn, t-tau, or p-tau181 levels (Fig. 3). In contrast, the genetic architecture of CSF Aβ42 was correlated with PD, with the best fit when collapsing independent SNPs with p-value < 0.01 (p = 2.50 × 10⁻¹¹) with a correlation coefficient (R²) of 2.29%. In PD cases and controls only, the correlation remained significant (p = 4.78 × 10⁻⁸), with an R² of 2.36%. In PD patients with both GWAS and CSF biomarker data, the CSF levels of each biomarker were analyzed by quartiles of the PRS calculated from META-PD risk. A significant difference (p = 7.30 × 10⁻⁴) was found between the top and the bottom quartiles; individuals with higher PRS values exhibited lower levels of CSF Aβ42 (Additional file 1: Fig. S6). No association between PD PRS and longitudinal changes of α-Syn, Aβ42, t-tau, and p-tau181 levels was found in the PPMI dataset. These results indicate that PD and Aβ42 CSF levels have a shared genomic architecture.

Mendelian randomization suggests a causal link between CSF Aβ42 and Parkinson's disease
Robust regression with the MR-Egger method found no association for t-tau or p-tau181 levels but revealed a trend for CSF α-Syn levels (effect = −1.40; p = 0.06), and a significant causal effect for CSF Aβ42 on PD (effect = 0.43; p = 1.44 × 10⁻⁵) (Fig. 4a-c and Additional file 2: Table S7; Table 2 and Additional file 2: Table S8). Additionally, a significant causal effect for CSF Aβ42 on PD age-at-onset was found using the data from Blauwendraat et al., 2019 (effect = 7.75; p = 7.65 × 10⁻⁶; Table 2 and Additional file 2: Table S8). A leave-one-out sensitivity analysis on CSF Aβ42 revealed that the proxy SNP for APOE ε4, rs769449, is the strongest instrumental variable of this analysis (I² is greater than 90% except when this variant was removed) and the main driver of the causal effect of CSF Aβ42 on PD. Other SNPs contribute in a smaller proportion to the causal effect (Fig. 4d). Altogether, these results suggest a causal role of SNPs on the APOE locus and CSF Aβ42 on PD.
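The leave-one-out logic used in this sensitivity analysis can be sketched as follows: the causal estimate is recomputed once per instrument, each time with that instrument removed, so that a variant driving the result stands out. This is a minimal Python illustration using a plain IVW point estimate; the function names and the identifiers in the usage example are hypothetical, and the robust MR-Egger variant reported in the text is not reproduced here.

```python
def ivw_estimate(bx, by, se_y):
    """IVW causal estimate from SNP-exposure (bx) and SNP-outcome (by) effects."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den


def leave_one_out(snp_ids, bx, by, se_y):
    """Return {snp_id: IVW estimate computed without that SNP}."""
    results = {}
    for i, snp in enumerate(snp_ids):
        keep = [j for j in range(len(snp_ids)) if j != i]
        results[snp] = ivw_estimate(
            [bx[j] for j in keep],
            [by[j] for j in keep],
            [se_y[j] for j in keep],
        )
    return results


# Usage with made-up numbers: the third variant pulls the estimate upward.
loo = leave_one_out(
    ["snp_a", "snp_b", "snp_c"],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 4.0],
    [1.0, 1.0, 1.0],
)
```

In this toy input, the estimate from all three variants is 2.0 but drops to 1.0 once the third variant is excluded, which is the pattern a single dominant instrument produces.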

Discussion
CSF α-Syn, Aβ42, t-tau, and p-tau181 levels were significantly lower in PD cases compared with controls, as we previously reported with a smaller sample size [9]. GWAS were performed using CSF biomarker levels as quantitative traits in a large cohort (N = 1960). With the current sample size, no signal reached the GWAS significance threshold for CSF α-Syn, t-tau, or p-tau181. A SNP proxy for APOE ε4 was associated with CSF Aβ42 levels at genome-wide significance. The PRS calculated using META-PD was associated with PD status and correlated with the genomic architecture of CSF Aβ42; in fact, individuals with higher PRS exhibit lower CSF Aβ42 levels. Two-sample MR analysis revealed that CSF Aβ42 probably plays a role in PD and PD age-at-onset, an effect mainly mediated by variants in the APOE locus. Using a subset of participants from the WUSTL cohort with additional clinical and neuropathological data, we found that the APOE ε4 allele was associated with lower levels of CSF Aβ42, higher cortical binding of PiB PET and higher Braak Aβ score. This is the first comprehensive analysis of CSF α-Syn and AD biomarkers using GWAS, PRS, and MR in PD. We found lower levels of CSF α-Syn in PD cases compared to controls in a cross-sectional analysis but no significant differences in the longitudinal study (PPMI). CSF α-Syn, as measured with ELISA-based assays, is not a clinically useful diagnostic marker for PD, and its utility as an outcome measure for clinical trials or progression is still controversial [35,53]. CSF biomarkers in AD used as quantitative endophenotypes have provided insights into AD pathophysiology [24]. Here, we used a large CSF α-Syn cohort (N = 1920) to identify its genetic modifiers. However, we did not find any locus associated with CSF α-Syn levels. Recently, a GWAS on CSF α-Syn using the ADNI cohort (N = 209) reported a genome-wide significant locus [73] (rs7072338). In the present meta-analyses (N = 1960), the p-value for rs7072338 was not significant (p = 0.99).
In the ADNI cohort, we found a nominal association for this SNP (p = 0.50 × 10⁻³). No correlation was found between the genetic architecture of PD and cross-sectional or longitudinal CSF α-Syn levels, consistent with what we have previously reported [41].

Fig. 4 MR regressions on Parkinson's disease risk genetic architecture and CSF α-Syn and Aβ42 levels. a Association between META-PD risk and CSF α-Syn levels (four variants). Robust regression MR-Egger method effect = −1.40 and p = 0.06, which is not consistent with causality. b Association between Parkinson's disease risk and CSF Aβ42 levels (twelve variants). Robust regression with MR-Egger method effect = 0.43 and p = 1.44 × 10⁻⁵, which is consistent with causality. Each dot corresponds to one genetic variant, with a 95% confidence interval (CI) of its genetic association with the exposure (α-Syn and Aβ42 levels) and the outcome (Parkinson's disease risk). Regression lines correspond to the robust MR-Egger method regression; numerical results are given for all tested methods in Additional file 2: Table S8. c CSF Aβ42 regression using multiple MR methods. Each dot is one of the twelve variants included in this test; the effect of CSF Aβ42 levels is on the x-axis and Parkinson's disease risk on the y-axis. Each line represents the regression of CSF Aβ42 levels on Parkinson's disease risk with one MR method. Additional details on the data sources and analysis methods to generate these figures are provided in Additional file 2: Table S8. d The forest plot illustrates the leave-one-out sensitivity analysis between CSF Aβ42 and META-PD risk. MR analysis without rs769449 decreased the I² statistic (I² = 0.0%) and increased the p-value to non-significant levels, suggesting that the association is mainly driven by this variant.
Using MR methods, we found a trend for the association between CSF α-Syn levels and the risk of developing PD. However, sensitivity analyses showed limited power due to the small number of variants included in the analyses. MR analyses suggest that Aβ42 could play a causal role in PD. Our MR results consistently identified a causal relationship between the APOE locus, CSF Aβ levels, and PD. MR is used to test whether the genetic variation associated with a trait has a causal relationship with a health outcome [20]. Unlike observational studies, MR is less susceptible to confounding factors and reverse causation. However, the proper implementation of MR depends on several assumptions [20]. The first is relevance: the instrumental variables (SNPs) must be associated with the exposure. Here, instrumental variables relevant to CSF Aβ were previously and consistently identified [24]. A second MR assumption is independence: SNPs associated with the trait (e.g. APOE locus with CSF Aβ) should not be associated with confounders of the relationship between the trait and the outcome (PD risk). The third MR assumption is the exclusion restriction, which means that SNPs do not affect PD risk except through CSF Aβ levels. The two-sample MR used here requires two additional criteria: both cohorts must have a similar genetic background but no overlap with each other. Here, the samples used for the MR analysis, summary statistics from Deming et al. [24] and Nalls et al. [54], met both criteria. We could not rule out a horizontal pleiotropic effect on PD of all the SNPs associated with CSF Aβ, but our study is powered to detect the causal association with the APOE locus. Thus, we inferred that the lifetime effect of the APOE locus is causal in relation to PD.
The association between the APOE locus and CSF Aβ42 levels was GWAS significant in the meta-analysis. The association of the APOE locus with CSF Aβ42 levels has been previously reported in AD [48] but not in PD cohorts. Interestingly, the direction of the effect was the same as what has been reported in AD but with a higher effect size (−0.57 in PD compared to −0.10 in AD) [25]. The APOE locus is the most significant locus associated with sporadic LBD [59,60] and cognitive decline in PD [40], but not with PD risk [54]. Here, we also found for the first time that patients with higher PRS from PD risk exhibit lower levels of CSF Aβ42, suggesting that similar genes or pathways predispose individuals to an accumulation of Aβ in the brain and to develop PD. This is in agreement with a recent report suggesting that the PD genes from the PRS analysis are enriched for AD genes [2].
[Figure caption, partially recovered] [...] N = 156). c PD patients carrying the APOE ε4 allele exhibit a higher Braak Aβ score than non-carriers (N = 92). Differences between APOE ε4 carriers and non-carriers were statistically significant by the Mann-Whitney U test.

The results from unbiased analyses like GWAS, PRS and MR demonstrated a link between PD genetic risk, CSF Aβ42 levels, and the APOE locus. Here, we also provided further evidence by showing that PD patients carrying the APOE ε4 allele presented with lower levels of CSF Aβ42 (p = 3.8 × 10⁻⁶), higher MCBP (p = 5.80 × 10⁻⁸) and higher Braak Aβ scores (p = 4.40 × 10⁻⁴). These results support the synergistic relationship between α-Syn and Aβ pathology in AD, PD and LBD brains [43], and the effect of Aβ plaques exacerbating the propagation of α-Syn pathology in mouse models [3]. It is known that APOE ε4 drives the production of Aβ and the accumulation of Aβ fibrils in AD patients [37], exacerbates tau-mediated neurodegeneration in a mouse model of tauopathy [62] and affects CSF α-Syn levels in the prodromal phase of sporadic and familial AD [67]. However, the role of APOE in human synucleinopathies is probably more complex. In LBD patients, the APOE ε4 effect on α-Syn pathology could be dependent on concurrent Aβ and/or tau pathology [58]; however, APOE ε4 also promotes α-Syn pathology independently [27,65] and affects CSF α-Syn levels [67]. We recently showed that APOE ε4 increased α-Syn phosphorylation, worsened motor impairment, and increased neuroinflammation and neurodegeneration in different mouse models [22]. This is the largest sample size used for discovering CSF α-Syn genetic modifiers to date, and yet no GWAS significant locus was found. It is possible that the complexity of α-Syn genetic architecture makes the current sample size insufficiently powered to detect signals with a smaller effect. Here, we found lower levels of CSF α-Syn in PD patients, which aligns with previous reports.
However, neither PRS nor MR analysis revealed evidence of the causal link of CSF α-Syn with PD risk. In fact, it has been reported that α-Syn aggregation is neither necessary nor sufficient for neurodegeneration or clinical parkinsonism [31,32]. The cohorts used in this study rely on clinical diagnosis rather than neuropathological confirmation, which precludes analyses of a correlation between CSF α-Syn levels and pathologic brain accumulation of brain α-Syn. Factors that may have contributed to the lack of power to detect genetic modifiers of CSF α-Syn include participant characteristics (PD subtypes, misdiagnosis, comorbidities, medications, disease duration), preanalytical factors (blood contamination at lumbar puncture), and differences in assays (measuring various abnormal pathological or normal forms of α-Syn) [26].
PD is a heterogeneous disorder with different identifiable clinical-pathological subtypes based on symptom severity and predominance [15]. It is conceivable that more homogeneous PD subtypes could be defined using biomarker-driven, clinical-molecular phenotyping approaches. This study, with 1960 samples with CSF α-Syn levels, showed that the genomic architecture of α-Syn is complex and not correlated with the genomic landscape of PD. Additional studies with larger sample sizes and standardized methods to quantify α-Syn in both CSF and brain are needed to uncover genetic modifiers of α-Syn levels. Our results using high-throughput and hypothesis-free, unbiased approaches demonstrated a link between PD genetic risk, CSF Aβ42 levels and APOE locus. These findings were further validated by strong significant associations of APOE ε4 with Aβ deposition in cortical regions of living and postmortem PD patients.

Acknowledgements
We thank all the participants and their families, as well as the many institutions and their staff. We would like to thank Dr. Shonali Midha for providing insightful comments on the manuscript.

Data availability
All data are available on the Center for Neurogenomics and Informatics (NGI) website (https://neurogenomics.wustl.edu/). The summary statistics for all the analyses can be easily explored in the Online Neurodegenerative Trait Integrative Multi-Omics Explorer (ONTIME) (https://omics.wustl.edu) and the Charles F. and Joanne Knight Alzheimer Disease Research Center (https://knightadrc.wustl.edu/research/resourcerequest.htm).

Ethics approval
The Institutional Review Boards of all participating institutions approved the study, and this research was carried out in accordance with the recommended