Skip to main content

Common gene expression signatures in Parkinson’s disease are driven by changes in cell composition


The etiology of Parkinson’s disease is largely unknown. Genome-wide transcriptomic studies in bulk brain tissue have identified several molecular signatures associated with the disease. While these studies have the potential to shed light into the pathogenesis of Parkinson’s disease, they are also limited by two major confounders: RNA post-mortem degradation and heterogeneous cell type composition of bulk tissue samples. We performed RNA sequencing following ribosomal RNA depletion in the prefrontal cortex of 49 individuals from two independent case-control cohorts. Using cell type specific markers, we estimated the cell type composition for each sample and included this in our analysis models to compensate for the variation in cell type proportions. Ribosomal RNA depletion followed by capture by random primers resulted in substantially more even transcript coverage, compared to poly(A) capture, in post-mortem tissue. Moreover, we show that cell type composition is a major confounder of differential gene expression analysis in the Parkinson’s disease brain. Accounting for cell type proportions attenuated numerous transcriptomic signatures that have been previously associated with Parkinson’s disease, including vesicle trafficking, synaptic transmission, immune and mitochondrial function. Conversely, pathways related to endoplasmic reticulum, lipid oxidation and unfolded protein response were strengthened and surface as the top differential gene expression signatures in the Parkinson’s disease prefrontal cortex. Our results indicate that differential gene expression signatures in Parkinson’s disease bulk brain tissue are significantly confounded by underlying differences in cell type composition. Modeling cell type heterogeneity is crucial in order to unveil transcriptomic signatures that represent regulatory changes in the Parkinson’s disease brain and are, therefore, more likely to be associated with underlying disease mechanisms.


Parkinson’s disease (PD) is the second most prevalent neurodegenerative disorder, affecting ~ 1.8% of the population above 65 years [45]. PD is a complex disorder caused by a combination of genetic and environmental factors, but the molecular mechanisms underlying its etiology remain largely unaccounted for. Genome-wide transcriptomic studies can identify expression signatures associated with PD. While not able to establish causality, these studies hold the potential to highlight important biological mechanisms, some of which may be exploited as targets for therapeutic modulation.

A recent systematic review identified 33 original genome-wide transcriptomic studies in the PD brain, of which 5 were performed on laser microdissected neurons from the substantia nigra pars compacta (SNc) and the remaining in bulk tissue from various brain regions [8]. These studies show surprisingly low replicability at the level of individual genes, however, and only partial concordance for pathways. The most consistent alterations have been found in pathways related to energy metabolism/mitochondrial function and protein degradation, followed by synaptic transmission, vesicle trafficking, lysosome/autophagy and neuroinflammation [8]. While these processes commonly show differential expression signatures in PD, it remains unknown whether this is because they truly reflect the biology of PD or due to systematic bias and confounding factors. Two major sources of bias for transcriptomic studies in the human brain are the post-mortem degradation of RNA and the highly heterogeneous cell type composition of bulk tissue samples.

RNA degradation of variable extent occurs in post-mortem tissue. To further complicate the picture, it has been shown that different cell types exhibit different degrees of susceptibility to RNA degradation [32], potentially confounding differences in cellular composition with differences in RNA quality. Access to high-quality brain tissue is generally limited, and thus an optimal choice of experimental platforms becomes paramount to maximize sensitivity. While RNA microarrays are being gradually superseded by RNA-seq technology, only 3 out of the 33 studies identified by an up-to-date review [8] used RNA-seq, and all of them employed poly(A) capture, a widely used protocol (in both RNA-seq and microarray analyses) to restrict the analysis to mature mRNA [20, 30, 46]. However, this library preparation method only picks up RNA fragments with a poly-A tail, introducing substantial bias in low quality RNA samples [1, 25, 47, 56]. A well-established approach to mitigate this limitation is whole RNA-seq following active ribosomal RNA (rRNA) depletion and capture by random primers, such as the Illumina Ribo-Zero technique [31]. To our knowledge there are no genome-wide transcriptomic studies on PD brain employing active rRNA depletion methods to date.

Systematic differences in sample cell composition represent another important confounding factor. These typically originate from two sources: biological differences (e.g. secondary to neurodegeneration) and technical variation in sample dissection and preparation. Brain areas affected by neurodegeneration are characterized by neuronal loss and gliosis, resulting in a systematically increased glia-to-neurons ratio in patients. This confounder is strongest in areas with severe changes, such as the SNc, but is also present to a variable degree in less affected areas, such as the neocortex. In addition, technical sources of variation due to sampling may affect any brain region and cause an uneven distribution of gray and white matter, resulting in a variable fraction of oligodendrocytes. Thus, transcriptional signatures associated with PD in bulk brain tissue may reflect changes in cellular composition rather than disease-specific transcriptional modulation. This observation has already been put forward using neurodegenerative mouse models and re-analysis of human brain transcriptomic data [50]. Heterogeneous cell composition is, hence, a major confounder that needs to be considered and appropriately addressed in transcriptomic studies in bulk brain samples.

We report the first genome-wide transcriptomic study in the PD brain employing RNA-seq following rRNA depletion and random primer capture. We show that this approach is able to substantially mitigate the bias of post-mortem degradation, resulting in substantially better transcript coverage compared to poly(A) capture. Moreover, by estimating the relative cell type proportion in our samples, we confirm that cellular composition is a major source of variation in bulk tissue data, confounding the differential gene expression profile even in the less affected prefrontal cortex. By incorporating the estimated cell type proportions into our analysis models, we were able to unveil transcriptomic signatures which are more likely to be associated with the underlying disease mechanisms.

Material and methods

Subject cohorts

All experiments were conducted in fresh-frozen prefrontal cortex (Brodmann area 9) from a total of 49 individuals from two independent cohorts. The first cohort (n = 29) comprised individuals with idiopathic PD (n = 18) from the Park-West study (PW), a prospective population-based cohort which has been described in detail [2] and neurologically healthy controls (Ctrl, n = 11) from our brain bank for aging and neurodegeneration. Whole-exome sequencing had been performed on all patients [24] and known/predicted pathogenic mutations in genes implicated in Mendelian PD and other monogenic neurological disorders had been excluded. None of the study participants had clinical signs of mitochondrial disease or use of medication known to influence mitochondrial function (Additional file 1). Controls had no known neurological disease and were matched for age and gender. The second cohort comprised samples from 21 individuals from the Netherlands Brain Bank (NBB) including idiopathic PD (n = 10) and demographically matched neurologically healthy controls (n = 11). Individuals with PD fulfilled the National Institute of Neurological Disorders and Stroke [26] and the UK Parkinson’s disease Society Brain Bank [54] diagnostic criteria for the disease at their final visit. Ethical permission for these studies was obtained from our regional ethics committee (REK 2017/2082, 2010/1700, 131.04).

To investigate the effect of the rRNA depletion and random primer capture protocol compared to the prevailing poly(A) method, we re-analyzed an RNA-seq dataset from a previous publication which employed a poly(A) tail selection kit on post-mortem tissue of the same brain area and same disease (PA cohort, n = 29 PD samples, n = 44 neurologically healthy controls, all males; GEO: GSE68719) [20]. Informed consent was available from all individuals.

Tissue collection and neuropathology

Brains were collected at autopsy and split sagittaly along the corpus callosum. One hemisphere was fixed whole in formaldehyde and the other coronally sectioned and snap-frozen in liquid nitrogen. All samples were collected using a standard technique and fixation time of ~ 2 weeks. There was no significant difference in post-mortem interval (PMI) (independent t-test, PW cohort p = 0.16; NBB cohort p = 0.92), age (independent t-test, PW cohort p = 0.18; NBB cohort p = 0.074) or gender (independent t-test, PW cohort p = 0.94; NBB cohort p = 0.53) between PD subjects and controls. Subject demographics and tissue availability are provided in Additional file 1. Routine neuropathological examination including immunohistochemistry for α-synuclein, tau and beta amyloid was performed on all brains. All cases showed neuropathological changes consistent with PD including degeneration of the dopaminergic neurons of the SNc in the presence of Lewy pathology. Controls had no pathological evidence of neurodegeneration.

RNA sequencing

Total RNA was extracted from prefrontal cortex tissue homogenate for all samples using RNeasy plus mini kit (Qiagen) with on-column DNase treatment according to manufacturer’s protocol. Final elution was made in 65 μl of dH2O. The concentration and integrity of the total RNA was estimated by Ribogreen assay (Thermo Fisher Scientific), and Fragment Analyzer (Advanced Analytical), respectively and 500 ng of total RNA was used for downstream RNA-seq applications. First, rRNA was removed using Ribo-Zero™ Gold (Epidemiology) kit (Illumina, San Diego, CA) using manufacturer’s recommended protocol. Immediately after the rRNA removal the RNA was fragmented and primed for the first strand synthesis using the NEBNext First Strand synthesis module (New England BioLabs Inc., Ipswich, MA). Directional second strand synthesis was performed using NEBNExt Ultra Directional second strand synthesis kit. Following this the samples were taken into standard library preparation protocol using NEBNext® DNA Library Prep Master Mix Set for Illumina® with slight modifications. Briefly, end-repair was done followed by poly(A) addition and custom adapter ligation. Post-ligated materials were individually barcoded with unique in-house Genomic Services Lab (GSL) primers and amplified through 12 cycles of PCR. Library quantity was assessed by Picogreen Assay (Thermo Fisher Scientific), and the library quality was estimated by utilizing a DNA High Sense chip on a Caliper Gx (Perkin Elmer). Accurate quantification of the final libraries for sequencing applications was determined using the qPCR-based KAPA Biosystems Library Quantification kit (Kapa Biosystems, Inc.). Each library was diluted to a final concentration of 12.5 nM and pooled equimolar prior to clustering. One hundred twenty-five bp Paired-End (PE) sequencing was performed on an Illumina HiSeq2500 sequencer (Illumina, Inc.). RNA quality, as measured by the RNA integrity number (RIN), varied across samples (mean = 5.3, range = 3.0–7.2 for PW; mean = 6.8, range = 3.2–9.1 for NBB), although the difference between conditions did not reach statistical significance in any of the cohorts (t-test P = 0.72 and 0.90 for PW and NBB cohorts, respectively).

Data quality control

FASTQ files were trimmed using Trimmomatic version 0.36 [7] to remove potential Illumina adapters and low quality bases with the following parameters: ILLUMINACLIP:truseq.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15. FASTQ files were assessed using fastQC version 0.11.5 [3] prior and following trimming. For an in-depth quality assessment, we mapped the trimmed reads using HISAT2 version 2.1.0 [34] against the hg19 human reference genome (using --rna-strandness RF option) preserving lane-specific information. To discard potential lane-specific sequencing batch effects we inspected the output of the CollectRnaSeqMetrics tool of Picard Tools version 2.6 [11]. Mapping efficiency and proportion of reads mapping to rRNA, intronic, intergenic and coding regions were obtained from the output of the CollectRnaSeqMetrics (Additional file 2: Figure S1 and S2).

For the poly(A) capture dataset [20], raw FASTQ files were obtained from the Gene Expression Omnibus (GEO:GSE68719) and analyzed exactly as described for our cohorts (with the exception of --rna-strandness in HISAT2, which was turned off to take into account that the cDNA library of this cohort was unstranded).

RNA expression quantification and filtering

We used Salmon version 0.9.1 [43] to quantify the abundance at the transcript level with the fragment-level GC bias correction option (−-gcBias) and the appropriate option for the library type (−l ISR) against the Ensembl release 75 transcriptome. Transcript-level quantification was collapsed onto gene-level quantification using the tximport R package version 1.8.0 [49] according to the gene definitions provided by the same Ensembl release. We filtered out genes in non-canonical chromosomes and scaffolds, and transcripts encoded by the mitochondrial genome. To further reduce the potential for artifacts we filtered out transcripts with unusually high expression by removing transcripts that gathered more than 1% of the reads on more than half of the samples, which resulted in the removal of 3 and 4 transcripts from the PW and NBB cohorts, respectively. Additionally, low-expressed (i.e. genes whose expression was below the median expression in at least 20% of the samples) were filtered out from downstream analyses. Samples were then marked as outliers if their median correlation in gene expression (log counts per million) with the other samples was below Q1–1.5*IQR or above Q3 + 1.5*IQR (Tukey’s fences; Q1: first quartile, Q3: third quartile, IQR: inter-quartile range). As a result, 3 samples were marked as outliers in the PW cohort and 3 in the NBB cohort, and were not included in downstream analyses (resulting sample sizes: NPW = 26, NNBB = 18, Additional file 2: Figure S3).

Estimation of marker gene profiles

It has been previously shown that cell type-specific transcriptional signature patterns derived from bulk tissue samples (marker gene profiles, MGPs), can be used as surrogates for relative cell type abundance across samples [37]. MGPs for each cell type are calculated individually, by summarizing the concordant change in their respective marker genes via the first principal component of their expression (i.e. log-transformed counts per million (CPMs)). For the purpose of our study, we calculated MGPs for the main cortical cell types (neurons, oligodendroglia, microglia, endothelial cells, and astrocytes). Cortical cell type markers were obtained from the NeuroExpresso database [37], a comprehensive database compiled using mouse brain cell type expression datasets, and human orthologs were defined using HomoloGene [38]. To reduce the impact of outlier samples, principal component analysis was repeated 100 times on subsampled data, containing an equal number of subjects per group, and removing markers with opposite sign of the main trend. The median score for each sample was used as MGP for the downstream analyses. MGPs obtained with Neuroexpresso-based markers were highly correlated with MGPs calculated using two independent sets of markers from human brain single-cell transcriptomic studies [33, 53] (Additional file 2: Figures S4–8, Additional file 2: Table S1). To assess potential variations associated with the disease across the neuronal markers, we examined the overlap between the markers and the differentially expressed genes in four publicly available datasets of laser microdissected neurons from PD brain (SNc dopaminergic neurons [13, 22, 48] and posterior cingulate cortex pyramidal neurons [51]). We found minimal overlap (3/78 genes) between our neuronal markers and genes differentially expressed in PD dopaminergic neurons. Moreover, none of the markers were differentially expressed in PD cortical neurons [51] (Additional File 2: Figure S9). The vast majority of the cell type markers used for the calculation of MGPs changed in the same direction across our samples (Additional File 2: Figure S9), indicating that MGPs truly represent changes in global cell type-specific transcription profiles, rather than being driven by changes in specific genes.

To unravel potential complex interactions between MGPs and other experimental covariates, including disease status, we calculated the pairwise correlation between all the variables and also their association with the main axes of variation of gene expression. To assist us in choosing an optimal set of MGPs to include as covariates, we quantified the group differences in the cellular proportions between PD and controls using linear models adjusting for the known experimental covariates (i.e. RIN, PMI, sex, age, and sequencing batch). Significant association with disease status was found for oligodendrocyte MGP in the PW cohort and for microglia in the NBB cohort. Thus, these were included in the downstream analyses.

Differential gene expression and functional enrichment analyses

We performed differential gene expression analyses using the DESeq2 R package version 1.22.2 [35] with default parameters. Experimental covariates (sex, age, RIN, PMI, and sequencing batch) as well as oligodendrocyte and microglia MGPs were incorporated into the statistical model. Multiple hypothesis testing was performed with the default automatic filtering of DESeq2 followed by false discovery rate (FDR) calculation by the Benjamini-Hochberg procedure. Analyses were carried out independently for the two cohorts. Genes were scored according to their significance by transforming the p-values to account for direction of change. For each gene, the up-regulated score was calculated as \( {S}_{\mathrm{up}}=\left\{\begin{array}{c}1-p/2, LFC<0\\ {}p/2, LFC\ge 0\end{array}\right. \), and the down-regulated score as Sdown = 1 − Sup, where LFC corresponds to the log fold change and p to the nominal p-value of the gene. Genes were then tested for enrichment using alternatively log(Sup) and log(Sdown) scores employing the gene score resampling method implemented in the ermineR package version 1.0.1 [39], an R wrapper package for ermineJ [27] with the complete Gene Ontology (GO) database annotation [5] to obtain lists of up- and down-regulated pathways for each cohort.

In order to characterize the main biological processes affected by the cell type correction, we scored pathways based on the loss of significance caused by the addition of cellular estimates to the gene expression model. We quantified the difference in the level of significance in the up- and down-regulated enrichment results for each significant pathway as  = log(p0) − log(pCT), where pCT and p0 are the corrected enrichment p-values for the model with cell types (CT) and without (0), respectively. Only pathways that were significant in either one of the models were analyzed in this manner (p0 < 0.05 or pCT < 0.05).

The source code for the analyses is available in the GitLab repository ( under the GPL public license v3.0.


Ribo-zero is superior to poly(A) selection in post-mortem brain

We carried out RNA-seq using rRNA depletion and random primer capture (henceforth referred to as Ribo-Zero) in fresh-frozen prefrontal cortex (Brodmann area 9) from a total of 49 individuals from two independent cohorts: the Norwegian ParkWest study (PW, n = 29) [2] and the Netherlands Brain Bank (NBB, n = 21). Comparison of our data to a published poly(A) capture dataset of similar characteristics [20] (PA cohort) revealed important differences of mapping coverage. Mapping efficiency was slightly higher in the poly(A) dataset (PA: median = 0.976, range = 0.971–0.980) compared to the Ribo-Zero datasets (PW: median = 0.952, range = 0.940–0.962; NBB: median = 0.959, range = 0.947–0.965). The counts per million (CPM) of rRNA regions, as defined by Ensembl release 75, was very low in all samples (PW: median = 3099, range = 1047-7071; NBB: median = 1583, range = 1129-5024) and, as expected, significantly lower in the Ribo-Zero cohorts compared to the poly(A) dataset (PA: median = 40,058, range = 10,701-95,183) (Additional file 2: Figure S1).

In both datasets, the RIN was positively correlated with mapping efficiency to mRNA regions, but not to intergenic and/or intronic regions (Additional file 2: Figure S2). Despite having higher mean RIN values, the PA cohort showed a marked unevenness of transcript body coverage compared to the Ribo-Zero cohorts (Fig. 1a). The median coefficient of variation in coverage was significantly lower in the Ribo-Zero cohorts and the 5′- and 3′-ends of the transcripts showed substantially better coverage compared to the PA cohort (Fig. 1b). Moreover, in the Ribo-Zero datasets both the 3′- and 5′-end coverage loss showed a significant inverse correlation with the RIN values. In contrast, RIN showed no correlation with the 5′-bias and a positive correlation with the 3′-bias in the PA dataset (Fig. 1c). Thus, Ribo-Zero results in substantially better and more even coverage of the transcriptome in post-mortem brain tissue, providing a better alternative to poly(A) capture and minimizing the prospect of transcript quantification biases downstream.

Fig. 1

Transcript coverage profiles of Ribo-Zero datasets compared to poly(A). a Heatmaps of transcript coverage in our two cohorts (PW, NBB) and a poly(A) dataset (PA). The y-axis shows samples sorted by RIN (top: lowest RIN; bottom highest RIN). The x-axis represents the transcript body percentiles (5′ to 3′). The shading for a given row represents the sample-normalized coverage averaged across all transcripts. b Boxplots for different coverage quality metrics: median 5′-bias, median 3′-bias and median coefficient of variation (CV) for each cohort. The bias metric is calculated by Picard tools on the 1000 most highly expressed transcripts and corresponds to the mean coverage of the 3′(or 5′)-most 100 bases divided by the mean coverage of the whole transcript. Values closer to 1 indicate absence of bias, while values departing from 1 indicate a coverage bias (asterisks indicate significance at (*) p > 0.05, (**) p ≤ 0.01, (***) p ≤ 0.001, (****) p ≤ 0.0001, Wilcoxon test). The same metrics are expanded in (c), with sample scatterplots showing RIN values against the coverage quality metrics. Linear regression trends are indicated with black lines. P-values for the F-statistic of the linear model are also shown in the panels. Panels are organized in columns (cohorts) and quality metrics (rows). CV = coefficient of variation; PW = ParkWest cohort; NBB = Netherlands Brain Bank cohort; PA = poly(A) cohort

Cell composition is a major confounder of gene expression in bulk brain samples

The observed gene expression profiles in bulk brain tissue can be dramatically influenced by differences in cellular composition. Such differences can be a result of variation in gray/white matter ratios introduced during tissue extraction, inter-subject variability or represent disease related alterations [14, 37, 52]. To study the contribution of various technical and biological sources of variation in our dataset we first estimated marker gene profiles (MGPs) for the major classes of cortical cell types (astrocytes, microglia, oligodendrocytes, endothelial cells and neurons) in our samples by summarizing the expression of the cell type-specific marker genes as previously described [37, 52]. Next, we examined the Pearson’s correlation between potential sources of biological variation in our data, including technical and demographic factors (RIN, PMI, sex, age, and disease status) and MGPs. MGPs for neuronal cell types were significantly anticorrelated with the other main cortical cell types in both cohorts (p < 0.05, Fig. 2a). In agreement with previous studies [6, 32], MGPs were also correlated with RNA quality. In both cohorts RIN was significantly correlated with neuronal (positive correlation) and astrocyte (negative correlation) MGPs. Significant negative correlation of RIN with microglia MGPs was observed in the NBB cohort (Fig. 2a). Most concerning was the detection of a significant association between the oligodendrocyte MGP and the disease status in the PW cohort (Fig. 2a). The main axis of variation in gene expression (which explained 44 and 45% of the total variance in PW and NBB, respectively) was significantly correlated with RNA quality and cellular composition in both cohorts (Fig. 2b), singling out RNA quality and cellular composition as the main drivers of transcriptional change in bulk brain tissue.

Fig. 2

Analysis of sample covariates. a Pearson correlation coefficients for each pair of variables are shown in correlograms. Sizes of the circles in the upper triangular of the correlograms are proportional to the Pearson correlation coefficient, with color indicating positive (blue) or negative (red) coefficients. The precise values for the Pearson coefficients are indicated in the lower triangular. Non-significant pairwise correlations (p ≥ 0.05) are represented with a cross. b Heatmaps showing the association between the sample variables with the first 5 principal components of the gene expression. Only significant p-values (p < 0.05) are shown (linear regression F-test). c Cell type estimates based on MGPs for the main cortical cell types controlling for all the experimental variables except disease status (i.e. sex, age, PMI, RIN, and sequencing batch). P-values calculated with Wilcoxon tests. PW = ParkWest cohort; NBB = Netherlands Brain Bank cohort

We next looked for differences in cellular proportions between PD and controls adjusting for the known experimental covariates. In the NBB cohort, PD subjects exhibited a significant increase in the microglia MGP (p = 0.015, Wilcoxon test), while a significant increase in the oligodendrocyte MGP (p = 5.5 × 10− 3, Wilcoxon test) was observed in PD subjects from the PW cohort. In both cohorts, these changes were accompanied by a non-significant decrease in neuronal MGPs (Fig. 2c).

MGPs of different cell types are not entirely independent from each other, since changes in one cell type can be accompanied by changes in other cell types. Thus, to ensure that neuronal, endothelial, and astrocyte MGPs do not differ between the groups, we re-estimated group differences in these MGPs while adjusting for the oligodendrocyte and microglia MGPs. This analysis showed no significant differences between the groups (Additional file 2: Figure S10). Therefore, only MGPs of oligodendrocytes and microglia were included in the statistical model of differential expression.

Differential gene expression

Differential gene expression analysis of a total of ~ 31,000 pre-filtered genes was carried out using experimental covariates (sex, age, PMI, RIN, and sequencing batch) with or without oligodendrocyte and microglia MGPs. In the PW cohort, 595 genes were defined as differentially expressed (FDR < 0.05) without adjusting for cell type composition. Inclusion of oligodendrocyte and microglia MGPs in the model decreased the number of differentially expressed genes to a total of 220. In total, 74 genes remained significant both with and without adjustment for cell type composition. No genes with FDR < 0.05 were identified in the NBB cohort, irrespective of adjustment for cell type composition. A list with the nominally significant genes overlapping between the two cohorts is provided in Additional file 3. Comprehensive results of differential expression analysis are available in Additional file 4.

Functional enrichment

Functional enrichment analysis of the differential gene expression results without MGP adjustment indicated 476 significantly enriched (FDR < 0.05) pathways in PW (107 up-regulated and 369 down-regulated) and 992 in NBB (421 up-regulated and 571 down-regulated). MGP adjustment reduced the number of significant pathways to 89 in PW (35 up-regulated and 54 down-regulated) and 248 in NBB (115 up-regulated and 133 down-regulated). Of these, 34 pathways replicated across the two cohorts. Concordant pathways comprised protein folding, ER-related processes and lipid oxidation (Fig. 3). The complete results are provided in Additional file 5.

Fig. 3

Functional enrichment. The treemap shows the concordant enriched pathways between PW and NBB cohorts accounting for experimental covariates and MGPs (same direction of gene expression change and FDR < 0.05). Pathways are grouped with a white border if their gene overlap is above 0.5 (Szymkiewicz–Simpson coefficient). Darker shades of red/blue represent lower enrichment p-values for up−/down-regulated pathways. Sizes of the rectangles are proportional to pathway sizes

As expected, scoring each pathway according to the change in p-value when accounting for cellularity, revealed a marked downplay of the relevant cell type-specific functions (Table 1). In the PW cohort, which was characterized by a skewed oligodendrocytes/neurons proportion, the function with the largest attenuation (i.e. increase in p-value) was seen for up-regulation of myelination and other oligodendrocyte related functions and for down-regulation of neuronal pathways. For NBB, accounting for cell-composition resulted in attenuation of immunity and neuronal pathways, consistent with the unbalanced microglial/neuronal proportions seen in that cohort (Table 1). Strikingly, pathways linked to mitochondrial respiration, including respiratory complex I, were among the down-regulated processes that lost statistical significance when controlling for cellularity. The attenuation of the mitochondrial signal was observed in both cohorts. Conversely, up-regulation of protein folding-related pathways gained significance in both cohorts (Table 2 and Fig. 3). Complete results are provided in Additional file 6.

Table 1 Loss of significance in enriched pathways
Table 2 Gain of significance in enriched pathways


We present the first genome-wide transcriptomic study in the PD brain employing whole RNA-seq after rRNA depletion and random primer capture (Ribo-Zero). Our findings show that PD-associated differential gene expression signatures in bulk brain tissue are influenced to a great extent by the underlying differences in cell type composition of the samples. Modeling cell type heterogeneity allowed us to highlight transcriptional signatures that are likely to represent aberrant gene expression within the cells of the PD brain, rather than changes in cell composition.

Our results suggest that the Ribo-Zero approach is superior to the more commonly used poly(A) method and allows for a more accurate mapping and quantification of the transcriptome in post-mortem brain tissue. The Ribo-Zero method provides substantially higher evenness of coverage and effectively mitigates the 3′- and 5′-end coverage bias associated with poly(A) capture. Ultimately, the unevenness of coverage will influence transcript quantification, affecting the sensitivity of the differential expression estimates. While these observations are in agreement with previous comprehensive reports [1, 25, 47, 56], we cannot rule out the contribution of experimental variables specific to each cohort, in addition to the RNA sequencing methodology. Furthermore, while the Ribo-Zero protocol shows advantages compared to the poly(A) method, it is certainly not sufficient to fully mitigate the impact of RNA degradation on transcript quantification.

Our study supports the notion that cell composition can be a major confounder in bulk brain tissue transcriptomics. We estimated the relative cell type abundance across our samples by calculating MGPs for the main cortical cell types. While MGPs do not provide a direct measure of cell counts, they are a validated and robust surrogate for cell type composition [37, 52]. Moreover, we show that MGPs are (1) highly consistent across three different single cell-based marker sets, (2) highly robust to marker gene outliers, and (3) not susceptible to PD-associated changes in gene expression. Taken together, these results indicate that MGPs reliably represent the general behavior of cell type-specific transcriptional signature in our data.

Our analyses indicate that the observed expression profiles in both cohorts were driven predominantly by a combination of technical factors associated with RNA quality, and differences in cellular composition between PD and controls. This difference was primarily due to oligodendrocytes in PW and microglia in NBB. Since oligodendrocyte proliferation is not a pathological feature of PD, it is plausible that the difference in oligodendrocyte MGPs in PW was due to technical variation in gray/white matter content introduced during tissue sampling. Microglial infiltration does occur in affected areas of the PD brain [18]. It is noteworthy, however, that increased microglial MGP was only observed in one of the cohorts (NBB), highlighting the biological heterogeneity of PD. Accounting for relative cell proportions reduced the number of differentially expressed genes and attenuated the calculated enrichment of cell type-specific pathways between PD and controls. In the PW cohort, this alleviated a substantial false positive signature of oligodendrocyte genes presumably caused by skewed grey/white matter sampling bias. Similar sampling bias could be responsible for oligodendroglia-specific functions appointed to PD brain in previous transcriptomic studies [46].

Intriguingly, accounting for cellular proportions downplayed several of the transcriptomic signatures that have been previously associated with PD. For instance, the signal from vesicle trafficking- and synaptic transmission-related processes [9, 10, 15, 21, 29, 40] was significantly attenuated in both cohorts, suggesting that the signal was primarily driven by changes in neuronal proportions between PD and controls, rather than modulation of these pathways within neurons. Moreover, we observed an attenuation in the down-regulation of mitochondrial pathways, including the respiratory chain and oxidative phosphorylation, which are among the most consistent transcriptomic signatures in PD [8, 9, 19, 20, 29, 42, 55, 57]. The loss of transcriptional signal in these pathways is intriguing, because there is compelling evidence that decreased complex I protein levels occur in PD neurons [23]. Our results suggest that the previously reported transcriptional down-regulation of the respiratory chain is at least partly driven by altered cellular composition (due to decreased number of neurons which highly express these genes) and may therefore not be the sole mechanism by which neuronal complex I deficiency occurs in PD. Indeed, it has been suggested that complex I deficiency in PD may be mediated by proteolytic degradation by the LON-ClpP protease system, rather than transcriptional regulation [44].

Changes in the cell-composition of the affected brain regions occur in all neurodegenerative diseases, including PD, Alzheimer disease, amyotrophic lateral sclerosis (ALS) and Huntington disease. Interestingly, common and overlapping transcriptional signatures have been reported across these neurodegenerative diseases, including mitochondrial, neuronal-specific, and immunity-related pathways [4, 17]. Our findings suggest that these common transcriptional signatures of neurodegeneration may largely represent the common pattern of altered cellularity, involving neuronal loss and glial proliferation, rather than biological processes of causal nature.

Accounting for cell type composition in our samples highlighted processes related to the endoplasmic reticulum, unfolded protein response and lipid/fatty acid oxidation as the top differential gene expression signatures in the PD prefrontal cortex. Unfolded protein response is indeed one of the most consistently reported transcriptomic signatures in PD [8, 9, 20, 28, 41, 55]. Moreover, endoplasmic reticulum stress and aberrant proteostasis have been associated with the accumulation of misfolded proteins, including α-synuclein, in both in vitro studies and animal models of PD [16]. While less is known regarding the role of lipid metabolism in PD, evidence of aberrant fatty acid oxidation has been found by metabolomic studies in serum [12] and urine [36] of patients. Our results corroborate these findings and indicate that aberrant fatty acid metabolism occurs in the PD prefrontal cortex.

Based on our findings, we advocate that modeling cell type heterogeneity is crucial in order to unveil transcriptomic signatures reflecting regulatory changes in the PD brain. It is, however, important noting that modeling of cellular estimates cannot completely mitigate the cell-composition bias in bulk tissue. Moreover, cell type correction complicates the identification of transcriptional changes that are confounded with changes in cellular composition and may thus increase the false negative rate. Single-cell or cell-sorting based methods will be key to overcoming this limitation and deciphering transcriptomic signatures directly associated with underlying disease mechanisms in PD.


Our findings show that differential gene expression signatures derived from bulk brain tissue of PD patients are significantly confounded by underlying differences in cell type composition. Modeling cell type heterogeneity is crucial in order to unveil transcriptomic signatures that represent regulatory changes in the PD brain and are, therefore, more likely to be associated with underlying disease mechanisms.

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article and its supplementary files. The source code for the analyses is available in the GitLab repository ( under the GPL public license v3.0.



Amyotrophic lateral sclerosis


Benjamini-Hochberg false discovery rate


Gene Ontology


Marker gene profiles


Netherlands Brain Bank


Parkinson’s disease


Post-mortem interval


Park West


RNA integrity number


ribosomal RNA


Substantia nigra pars compacta


  1. 1.

    Adiconis X, Borges-Rivera D, Satija R, DeLuca DS, Busby MA, Berlin AM, Sivachenko A, Thompson DA, Wysoker A, Fennell T, Gnirke A, Pochet N, Regev A, Levin JZ (2013) Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat Methods 10:623–629.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  2. 2.

    Alves G, Muller B, Herlofson K, HogenEsch I, Telstad W, Aarsland D, Tysnes O-B, Larsen JP, for the Norwegian ParkWest study group (2009) Incidence of Parkinson’s disease in Norway: the Norwegian ParkWest study. J Neurol Neurosurg Psychiatry 80:851–857.

    CAS  Article  PubMed  Google Scholar 

  3. 3.

    Andrews S. (2010). FastQC: a quality control tool for high throughput sequence data. Available online at:

  4. 4.

    Arneson D, Zhang Y, Yang X, Narayanan M (2018) Shared mechanisms among neurodegenerative diseases: from genetic factors to gene networks. J Genet 97:795–806

    Article  Google Scholar 

  5. 5.

    Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G (2000) Gene ontology: tool for the unification of biology. Nat Genet 25:25–29.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  6. 6.

    Bauer M (2007) RNA in forensic science. Forensic Sci Int Genet 1:69–74.

    CAS  Article  PubMed  Google Scholar 

  7. 7.

    Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  8. 8.

    Borrageiro G, Haylett W, Seedat S, Kuivaniemi H, Bardien S (2018) A review of genome-wide transcriptomics studies in Parkinson’s disease. Eur J Neurosci 47:1–16.

    Article  PubMed  Google Scholar 

  9. 9.

    Bossers K, Meerhoff G, Balesar R, van Dongen JW, Kruse CG, Swaab DF, Verhaagen J (2009) Analysis of gene expression in Parkinson’s disease: possible involvement of neurotrophic support and axon guidance in dopaminergic cell death. Brain Pathol Zurich Switz 19:91–107.

    CAS  Article  Google Scholar 

  10. 10.

    Botta-Orfila T, Tolosa E, Gelpi E, Sànchez-Pla A, Martí M-J, Valldeoriola F, Fernández M, Carmona F, Ezquerra M (2012) Microarray expression analysis in idiopathic and LRRK2-associated Parkinson’s disease. Neurobiol Dis 45:462–468.

    CAS  Article  PubMed  Google Scholar 

  11. 11.

    Broad Institute (2018). Picard tools. Available online at:

  12. 12.

    Burté F, Houghton D, Lowes H, Pyle A, Nesbitt S, Yarnall A, Yu-Wai-Man P, Burn DJ, Santibanez-Koref M, Hudson G (2017) Metabolic profiling of Parkinson’s disease and mild cognitive impairment. Mov Disord Off J Mov Disord Soc 32:927–932.

    Article  Google Scholar 

  13. 13.

    Cantuti-Castelvetri I, Keller-McGandy C, Bouzou B, Asteris G, Clark TW, Frosch MP, Standaert DG (2007) Effects of gender on nigral gene expression and parkinson disease. Neurobiol Dis 26:606–614.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  14. 14.

    Capurro A, Bodea L-G, Schaefer P, Luthi-Carter R, Perreau VM (2014) Computational deconvolution of genome wide expression data from Parkinson’s and Huntington’s disease brain tissues using population-specific expression analysis. Front Neurosci 8:441.

    Article  PubMed  Google Scholar 

  15. 15.

    Chandrasekaran S, Bonchev D (2013) A network view on Parkinson’s disease. Comput Struct Biotechnol J 7:e201304004.

    Article  PubMed  PubMed Central  Google Scholar 

  16. 16.

    Colla E (2019) Linking the endoplasmic reticulum to Parkinson’s disease and alpha-Synucleinopathy. Front Neurosci 13:560.

    Article  PubMed  PubMed Central  Google Scholar 

  17. 17.

    Cooper-Knock J, Kirby J, Ferraiuolo L, Heath PR, Rattray M, Shaw PJ (2012) Gene expression profiling in human neurodegenerative disease. Nat Rev Neurol 8:518–530.

    CAS  Article  PubMed  Google Scholar 

  18. 18.

    Dickson DW (2012) Parkinson’s disease and parkinsonism: neuropathology. Cold Spring Harb Perspect Med 2:a009258–a009258.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  19. 19.

    Duke DC, Moran LB, Kalaitzakis ME, Deprez M, Dexter DT, Pearce RKB, Graeber MB (2006) Transcriptome analysis reveals link between proteasomal and mitochondrial pathways in Parkinson’s disease. Neurogenetics 7:139–148.

    CAS  Article  PubMed  Google Scholar 

  20. 20.

    Dumitriu A, Golji J, Labadorf AT, Gao B, Beach TG, Myers RH, Longo KA, Latourelle JC (2016) Integrative analyses of proteomics and RNA transcriptomics implicate mitochondrial processes, protein folding pathways and GWAS loci in Parkinson disease. BMC Med Genet 9:5.

    CAS  Article  Google Scholar 

  21. 21.

    Edwards YJK, Beecham GW, Scott WK, Khuri S, Bademci G, Tekin D, Martin ER, Jiang Z, Mash DC, ffrench-Mullen J, Pericak-Vance MA, Tsinoremas N, Vance JM (2011) Identifying consensus disease pathways in Parkinson’s disease using an integrative systems biology approach. PLoS One 6:e16917.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  22. 22.

    Elstner M, Morris CM, Heim K, Bender A, Mehta D, Jaros E, Klopstock T, Meitinger T, Turnbull DM, Prokisch H (2011) Expression analysis of dopaminergic neurons in Parkinson’s disease and aging links transcriptional dysregulation of energy metabolism to cell death. Acta Neuropathol (Berl) 122:75–86.

    CAS  Article  Google Scholar 

  23. 23.

    Flønes IH, Fernandez-Vizarra E, Lykouri M, Brakedal B, Skeie GO, Miletic H, Lilleng PK, Alves G, Tysnes O-B, Haugarvoll K, Dölle C, Zeviani M, Tzoulis C (2018) Neuronal complex I deficiency occurs throughout the Parkinson’s disease brain, but is not associated with neurodegeneration or mitochondrial DNA damage. Acta Neuropathol (Berl) 135:409–425.

    Article  Google Scholar 

  24. 24.

    Gaare JJ, Nido GS, Sztromwasser P, Knappskog PM, Dahl O, Lund-Johansen M, Maple-Grødem J, Alves G, Tysnes O-B, Johansson S, Haugarvoll K, Tzoulis C (2018) Rare genetic variation in mitochondrial pathways influences the risk for Parkinson’s disease: mitochondrial pathways in PD. Mov Disord 33:1591–1600.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  25. 25.

    Gallego Romero I, Pai AA, Tung J, Gilad Y (2014) RNA-seq: impact of RNA degradation on transcript quantification. BMC Biol 12:42.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  26. 26.

    Gelb DJ, Oliver E, Gilman S (1999) Diagnostic criteria for Parkinson disease. Arch Neurol 56:33–39

    CAS  Article  Google Scholar 

  27. 27.

    Gillis J, Mistry M, Pavlidis P (2010) Gene function analysis in complex data sets using ErmineJ. Nat Protoc 5:1148–1159.

    CAS  Article  PubMed  Google Scholar 

  28. 28.

    Grünblatt E, Mandel S, Jacob-Hirsch J, Zeligson S, Amariglo N, Rechavi G, Li J, Ravid R, Roggendorf W, Riederer P, Youdim MBH (2004) Gene expression profiling of parkinsonian substantia nigra pars compacta; alterations in ubiquitin-proteasome, heat shock protein, iron and oxidative stress regulated proteins, cell adhesion/cellular matrix and vesicle trafficking genes. J Neural Transm Vienna Austria 111:1543–1573.

    CAS  Article  Google Scholar 

  29. 29.

    Hauser MA, Li Y-J, Xu H, Noureddine MA, Shao YS, Gullans SR, Scherzer CR, Jensen RV, McLaurin AC, Gibson JR, Scott BL, Jewett RM, Stenger JE, Schmechel DE, Hulette CM, Vance JM (2005) Expression profiling of substantia nigra in Parkinson disease, progressive supranuclear palsy, and frontotemporal dementia with parkinsonism. Arch Neurol 62:917–921.

    Article  PubMed  Google Scholar 

  30. 30.

    Henderson-Smith A, Corneveaux JJ, De Both M, Cuyugan L, Liang WS, Huentelman M, Adler C, Driver-Dunckley E, Beach TG, Dunckley TL (2016) Next-generation profiling to identify the molecular etiology of Parkinson dementia. Neurol Genet 2:e75.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  31. 31.

    Huang R, Jaritz M, Guenzl P, Vlatkovic I, Sommer A, Tamir IM, Marks H, Klampfl T, Kralovics R, Stunnenberg HG, Barlow DP, Pauler FM (2011) An RNA-Seq strategy to detect the complete coding and non-coding Transcriptome including full-length imprinted macro ncRNAs. PLoS One 6:e27288.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  32. 32.

    Jaffe AE, Tao R, Norris AL, Kealhofer M, Nellore A, Shin JH, Kim D, Jia Y, Hyde TM, Kleinman JE, Straub RE, Leek JT, Weinberger DR (2017) qSVA framework for RNA quality correction in differential expression analysis. Proc Natl Acad Sci U S A 114:7130–7135.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  33. 33.

    Kelley KW, Nakao-Inoue H, Molofsky AV, Oldham MC (2018) Variation among intact tissue samples reveals the core transcriptional features of human CNS cell classes. Nat Neurosci 21:1171–1184.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  34. 34.

    Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12:357–360.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  35. 35.

    Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15:550.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  36. 36.

    Luan H, Liu L-F, Meng N, Tang Z, Chua K-K, Chen L-L, Song J-X, Mok VCT, Xie L-X, Li M, Cai Z (2015) LC-MS-based urinary metabolite signatures in idiopathic Parkinson’s disease. J Proteome Res 14:467–478.

    CAS  Article  PubMed  Google Scholar 

  37. 37.

    Mancarci BO, Toker L, Tripathy SJ, Li B, Rocco B, Sibille E, Pavlidis P (2017, 2017) Cross-laboratory analysis of brain cell type Transcriptomes with applications to interpretation of bulk tissue data. eNeuro 4.

  38. 38.

    Mancarci O. (2019). Homologene: quick access to homologene and gene annotation updates. R package version 1.4.68. Available online at:

  39. 39.

    Mancarci O. (2019). ermineR: gene set analysis with multifunctionality assessment. R package version 1.0.1. Available online at:

  40. 40.

    Miller RM, Kiser GL, Kaysser-Kranich TM, Lockner RJ, Palaniappan C, Federoff HJ (2006) Robust dysregulation of gene expression in substantia nigra and striatum in Parkinson’s disease. Neurobiol Dis 21:305–313.

    CAS  Article  PubMed  Google Scholar 

  41. 41.

    Moran LB, Duke DC, Deprez M, Dexter DT, Pearce RKB, Graeber MB (2006) Whole genome expression profiling of the medial and lateral substantia nigra in Parkinson’s disease. Neurogenetics 7:1–11.

    CAS  Article  PubMed  Google Scholar 

  42. 42.

    Papapetropoulos S, Ffrench-Mullen J, McCorquodale D, Qin Y, Pablo J, Mash DC (2006) Multiregional gene expression profiling identifies MRPS6 as a possible candidate gene for Parkinson’s disease. Gene Expr 13:205–215

    CAS  Article  Google Scholar 

  43. 43.

    Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C (2017) Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14:417–419.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  44. 44.

    Pryde KR, Taanman JW, Schapira AH (2016) A LON-ClpP Proteolytic Axis degrades complex I to extinguish ROS production in depolarized mitochondria. Cell Rep 17:2522–2531.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  45. 45.

    de Rijk MC, Launer LJ, Berger K, Breteler MM, Dartigues JF, Baldereschi M, Fratiglioni L, Lobo A, Martinez-Lage J, Trenkwalder C, Hofman A (2000) Prevalence of Parkinson’s disease in Europe: a collaborative study of population-based cohorts. Neurologic Diseases in the Elderly Research Group. Neurology 54:S21–S23

    PubMed  Google Scholar 

  46. 46.

    Riley BE, Gardai SJ, Emig-Agius D, Bessarabova M, Ivliev AE, Schüle B, Schüle B, Alexander J, Wallace W, Halliday GM, Langston JW, Braxton S, Yednock T, Shaler T, Johnston JA (2014) Systems-based analyses of brain regions functionally impacted in Parkinson’s disease reveals underlying causal mechanisms. PLoS One 9:e102909.

    Article  PubMed  PubMed Central  Google Scholar 

  47. 47.

    Schuierer S, Carbone W, Knehr J, Petitjean V, Fernandez A, Sultan M, Roma G (2017) A comprehensive assessment of RNA-seq protocols for degraded and low-quantity samples. BMC Genomics 18:442.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  48. 48.

    Simunovic F, Yi M, Wang Y, Stephens R, Sonntag KC (2010) Evidence for gender-specific transcriptional profiles of nigral dopamine neurons in Parkinson disease. PLoS One 5:e8856.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  49. 49.

    Soneson C, Love MI, Robinson MD (2016) Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res 4:1521.

    Article  PubMed Central  Google Scholar 

  50. 50.

    Srinivasan K, Friedman BA, Larson JL, Lauffer BE, Goldstein LD, Appling LL, Borneo J, Poon C, Ho T, Cai F, Steiner P, van der Brug MP, Modrusan Z, Kaminker JS, Hansen DV (2016) Untangling the brain’s neuroinflammatory and neurodegenerative transcriptional responses. Nat Commun 7:11295.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  51. 51.

    Stamper C, Siegel A, Liang WS, Pearson JV, Stephan DA, Shill H, Connor D, Caviness JN, Sabbagh M, Beach TG, Adler CH, Dunckley T (2008) Neuronal gene expression correlates of Parkinson’s disease with dementia. Mov Disord Off J Mov Disord Soc 23:1588–1595.

    Article  Google Scholar 

  52. 52.

    Toker L, Mancarci BO, Tripathy S, Pavlidis P (2018) Transcriptomic evidence for alterations in astrocytes and Parvalbumin interneurons in subjects with bipolar disorder and schizophrenia. Biol Psychiatry 84:787–796.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  53. 53.

    Velmeshev D, Schirmer L, Jung D, Haeussler M, Perez Y, Mayer S, Bhaduri A, Goyal N, Rowitch DH, Kriegstein AR (2019) Single-cell genomics identifies cell type-specific molecular changes in autism. Science 364:685–689.

    CAS  Article  PubMed  Google Scholar 

  54. 54.

    Ward CD, Gibb WR (1990) Research diagnostic criteria for Parkinson’s disease. Adv Neurol 53:245–249

    CAS  PubMed  Google Scholar 

  55. 55.

    Zhang Y, James M, Middleton FA, Davis RL (2005) Transcriptional analysis of multiple brain regions in Parkinson’s disease supports the involvement of specific protein processing, energy metabolism, and signaling pathways, and suggests novel disease mechanisms. Am J Med Genet B Neuropsychiatr Genet 137B:5–16.

    Article  PubMed  Google Scholar 

  56. 56.

    Zhao W, He X, Hoadley KA, Parker JS, Hayes D, Perou CM (2014) Comparison of RNA-Seq by poly (A) capture, ribosomal RNA depletion, and DNA microarray for expression profiling. BMC Genomics 15:419.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  57. 57.

    Zheng B, Liao Z, Locascio JJ, Lesniak KA, Roderick SS, Watt ML, Eklund AC, Zhang-James Y, Kim PD, Hauser MA, Grünblatt E, Moran LB, Mandel SA, Riederer P, Miller RM, Federoff HJ, Wüllner U, Papapetropoulos S, Youdim MB, Cantuti-Castelvetri I, Young AB, Vance JM, Davis RL, Hedreen JC, Adler CH, Beach TG, Graeber MB, Middleton FA, Rochet J-C, Scherzer CR, Global PD Gene Expression (GPEX) Consortium (2010) PGC-1α, a potential therapeutic target for early intervention in Parkinson’s disease. Sci Transl Med 2:52ra73.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

Download references


Not applicable.


This work is supported by grants from The Research Council of Norway (288164, ES633272) and Bergen Research Foundation (BFS2017REK05).

Author information




GSN participated in the study conception, preprocessed the RNA-seq data, designed and performed the RNA-seq computational analyses. FD participated in the RNA-seq computational analyses. KH contributed to data interpretation and critical revision of the manuscript. IJ and KP participated in the conception and design of the study, data interpretation and critical revision of the manuscript. OBT and GA contributed part of the tissue material, supplementary data, and critical revision of the manuscript. CT conceived, designed and directed the study, contributed to data analyses and interpretation and acquired funding for the study. GSN and CT wrote the manuscript with the active input and participation of LT, KH, IJ, and KP All authors have read and approved the manuscript.

Corresponding author

Correspondence to Charalampos Tzoulis.

Ethics declarations

Ethics approval and consent to participate

Informed consent was available from all individuals. Ethical permission for these studies was obtained from our regional ethics committee (REK 2017/2082, 2010/1700, 131.04).

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1:.

Cohort demographic and experimental information.

Additional file 2: Figure S1.

Read mapping efficiency; Figure S2. Read mapping statistics; Figure S3. Sample clustering; Figure S4. Neuronal MGPs and expression of neuronal markers; Figure S5. Oligodendrocyte MGPs and expression of oligodendrocyte markers; Figure S6. Microglial MGPs and expression of microglial markers; Figure S7. Astrocyte MGPs and expression of astrocyte markers; Figure S8. Endothelial MGPs and expression of endothelial markers; Figure S9. Neuroexpresso neuronal markers and overlap with single-cell neuronal DEGs; Figure S10. Cellular estimates grouped by status; Table S1. Correlations between MGPs calculated on different marker sets.

Additional file 3.

Significant genes overlapping PW and NBB accounting for cell types.

Additional file 4.

Complete results of the differential expression analyses.

Additional file 5.

ErmineR pathway enrichment analyses before and after accounting for cell composition.

Additional file 6

Up- and down-regulated pathways ranked by the change in p-value resulting from accounting for cell composition.

Additional file 7.

Read count matrix.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Nido, G.S., Dick, F., Toker, L. et al. Common gene expression signatures in Parkinson’s disease are driven by changes in cell composition. acta neuropathol commun 8, 55 (2020).

Download citation


  • RNA sequencing
  • Neurodegeneration
  • Parkinsonism
  • Mitochondria
  • Gene expression