Clinical-molecular characteristics of primary-relapse medulloblastoma cohort
Forty-three primary and relapse tumor samples identified via DNA methylation profiling as MB were selected for the investigation of their paired transcriptome profiles. The cohort included 24 SHH-MB, 5 Group 3 MB and 14 Group 4 MB pairs (Fig. 1a, Additional File 2: Table S1). WNT-MB was excluded from this study due to the low number of primary-relapse sample transcriptome profiles (n = 1) available in our DKFZ cohort.
Evaluation of the radiological images (see “Methods” Section, Additional File 2: Table S1) identified anatomic patterns of relapsed MB as follows: (1) isolated local relapses (12/43 [28%] MB; SHH-MB/Group 3 MB/Group 4 MB distribution 12/0/0; mean age—17 years; male/female ratio 57%/43%; M2-3 at diagnosis—10%; median PFS—64 months); (2) distant/metastatic relapses (15/43 [35%] MB; SHH-MB/Group 3 MB/Group 4 MB distribution 4/4/7; mean age—17 years; male/female ratio 70/30%; M2-3 at diagnosis—30%; median PFS—30 months) and (3) combined local/distant relapses (16/43 [37%] MB; SHH-MB/Group 3 MB/Group 4 MB distribution 8/2/6; mean age—13 years; male/female ratio 70%/30%; M2-3 at diagnosis—40%; median PFS—30 months). Histopathological evaluation revealed that 32/73% MB pairs had consistent tumor histology at primary diagnosis and at relapse (among them—10 classic MB; 11 desmoplastic/nodular MB (DNMB), and 11 large-cell/anaplastic MB (LCA)), while 12/27% MB primaries disclosed histological diversity at the relapse when 3/13 classic MB (23%) and 9/20 DNMB (45%) were diagnosed as LCA at relapse. There was no preponderance of anatomic relapse patterns for tumors with and without histological divergence.
Medulloblastoma molecular groups and subgroups were conserved at relapse in 41 (95%) cases and only two Group 4 MB primary tumors demonstrated a switch to Group 3 MB at relapse, confirming a rarity of progression-associated MB group change that was observed previously in ~ 5% of Group 3/4 MB recurrences [14]. In these two samples, second-generation subgroups also switched from VII to V in one pair and from VIII to V in another one, respectively.
Somatic changes were identified in MB relapses with targeted exome sequencing and DNA methylation profiling (see “Methods” Section; Additional File 2: Table S1). Thus, 3 relapsed MB (7%) demonstrated variance in driver mutations whereas 20 cases (43%) disclosed differences in copy number changes. For example, TP53 somatic mutations were identified as relapse-specific in two infant SHH TP53-wt tumors (both harboring PTCH1 germline mutations). Relapse-specific amplification of MYCN was identified from methylation and FISH data (see Methods) in one SHH-MB and one Group 4 MB, similarly one SHH-MB case demonstrated GLI2 amplification only at relapse. Additional chromosomal gains and losses were identified specifically at relapse in 13/24 SHH-MB (55%), 1/5 Group 3 MB (20%) and 8/14 Group 4 MB (55%). Notably, prototypic chromosomal events such as 9q loss in SHH-MB and isochromosome 17q in Group 3 and Group 4 MB were conserved between diagnosis of primary and relapse in all affected cases.
After the concordance between primary and relapse RNA-sequencing profiles was verified based on the fingerprint SNV match (Additional File 1: Fig. S1a), we focused on the transcriptome variance analysis between primary and relapsed tumors. The gene expression data clearly demonstrated group specificity from unsupervised Principal Component (PC) analysis (Fig. 1b). The variance between primary and relapse cases was measured as Euclidian distance between top PC components (Fig. 1c), which revealed that this effect is quite strong, especially in Groups 3 and 4 MB. Notably, the PC variance in methylation data appeared to be much lower in these MB groups in comparison to gene expression (Additional File 1: Fig. S1b, c), probably driven by strong stability of this epigenetic effect [5].
SHH-MB tumors demonstrate strong changes in relapse transcriptome profiles at young age
As a next step, we focused on a more precise investigation of molecular groups starting from SHH-MB. The variance remained quite stable from the PC inspection within the SHH-MB cohort confirming its’ group-specificity (Fig. 2a). Interestingly, an evident negative correlation (p-value = 0.044) was identified between the variance of expression profiles and the age of patients (Fig. 2b). From further detailed inspection, we found that the transcriptome variances were also significantly higher for tumors with new CNVs (n = 11) as compared to those without them (n = 13) (Fig. 2c). Notably, CNV profiles occurring in relapsed SHH-MB in younger than 10 years (n = 8) demonstrate clear differences from their predominantly balanced primaries (Additional File 1: Fig. S2a) with frequent involvement of the TP53 locus at 17p (Fig. 2d). In contrast, in SHH-MB older than 10 years (n = 16), CNV profiles typically remain stable at relapse (Additional File 1: Fig. S2b). We also identified that primary-relapse expression variance was significantly higher for combined relapses (n = 8) as compared to local (n = 12) and distant (n = 4) relapses of SHH-MB (Fig. 2e).
Further, we analyzed if there were evident differentially expressed genes (DEGs) between primary and relapse in SHH-MB. The batch-effect adjusted analysis resulted in 484 genes whose expression was highly active in relapses and 184 genes whose expression was significantly lower in relapses (Additional File 2: Table S2). Gene ontology (GO) analysis demonstrated that genes over-expressed in relapsed SHH-MB were associated with molecular processes such as N-Glycan biosynthesis, ECM-receptor interaction, and ribosome biogenesis pathways, while genes under-represented in relapses were connected to MAPK, HIF1 and neurotrophin signaling (Additional File 2: Table S3).
Next, we analyzed the clinical relevance for top DEGs overexpressed in primary (top 20) and relapsed tumors (top 20) within two extended cohorts of primary SHH-MB included in international tumor sets (see Methods): relapsed (n = 98) and entire (n = 188) (Additional File 1: Fig. S2c). However, univariate survival analysis disclosed no associations of expression of any of these DEGs with clinical outcomes for both these SHH-MB cohorts. We also inspected if there are any effects associated with the age by separately calling DEGs for infant and older patient groups, but no associations to clinical outcomes were found.
Novel potential gene markers identified for relapse of group 3/4
We further performed a similar analysis of primary and relapse transcriptome profile differences for Groups 3 and 4 MB, merged as non-SHH/non-WNT MB, termed further as Group 3/4 MB cohort (Fig. 3a). In contrast to SHH-MB, these tumor variants disclosed no clear transcriptome variances associated with the patients’ age (Fig. 3b) or tumor relapse patterns of metastatic (n = 11) versus combined (n = 8) (Fig. 3c).
Even though some acquired CNV changes were observed in metastatic Group 3/4 MB relapses, the DNA profiles remained mostly stable for primary-relapse pairs (Additional File 1: Fig. S3a, b). In addition, the variance between primary and relapse transcriptome profiles remained quite stable in this context too, and was not associated with novel CNVs (Additional File 1: Fig. S3c).
Afterwards, we focused on the differences between primary and relapse profiles by identification of DEGs (Additional File 2: Table S4). The total amount of Group 3/4 MB DEGs (n = 585 higher expressed, n = 1526 lower expressed at relapse) was approximately three times higher in comparison to SHH-MB and more possible GO associations were identified for these gene sets (Additional File 2: Table S5). The genes overexpressed in relapsed Group 3/4 MB were found to be involved in several molecular pathways, including cell cycle activation, phagosome biogenesis, focal adhesion, and others. In contrast, genes low expressed in relapsed tumors were contributing to protein digestion, retinol metabolism, and chemical carcinogenesis.
Next, we analyzed the clinical relevance for top DEG overexpressed in primary (top 20) and relapsed (top 20) tumors within two extended cohorts of primary Group 3/4 MB included in international tumor sets (see Methods): relapsed (n = 196) and entire (n = 435) (Fig. 3d). The univariate survival analysis showed that high levels of several top evident genes overexpressed in relapsed tumors such as PDIA6 (Figs. 3e, 4a, b), MRPL32 (Additional File 1: Fig. S4c, d) or FKBP9 (Additional File 1: Fig. S4c, d) were associated with unfavorable PFS and OS in both relapsed and the entire Group 3/4 MB cohorts. In contrast, high expression of top genes down-regulated in relapsed Group 3/4 MB such as SNORD115-23 (Figs. 3f, 4c, d), TMEM261P1 (Additional File 1: Fig. S4e, f) or RIT2 (Additional File 1: Fig. S4g, h) were associated closely with favorable outcomes for two these cohorts of Group 3/4 MB. Moreover, these results were confirmed from analysis of the external MB transcriptome dataset [2], where the same survival patterns were observed in the entire Group 3/4 MB cohort (n = 377) for PDIA6 (Additional File 1: Fig. S5a), FKBP9 (Additional File 1: Fig. S5b), SNORD115-23 (Additional File 1: Fig. S5c) and RIT2 (Additional File 1: Fig. S5d) genes.
In addition, performing differential gene expression analysis on Group 3 MB and Group 4 MB primary-relapse cohorts separately verified the marked genes independently, even though less statistical evidence was observed for Group 3 MB, probably due lower number of primary-relapse pairs in this cohort (n = 5).
Finally, we performed multigene survival analysis in relapsed Group 3/4 MB cohort (see Methods) focusing on top 20 genes differentially expressed between primary and relapsed tumors respectively (Fig. 3d). Thus, we identified that 9 genes overexpressed in relapses were significantly associated with unfavorable OS in relapsed cohort whereas 10 genes specific for primary Group 3/4 MB were associated with favorable outcomes. We combined these clinically relevant genes in two metagene cohorts—unfavorable and favorable respectively which, in turn, were correlated closely with OS of relapsed Group 3/4 MB (Additional File 1: Fig. S5 e, f).
Cox regression analysis combining clinical and molecular variables in Group 3/4 MB revealed an independent prognostic significance of outlined unfavorable metagene set together with advanced M stages and MYC amplification (Additional File 2: Table S6). Interestingly, molecular MB groups did not reach an independent level in this multivariate model.
Deconvolution of bulk profiles uncovers functional structure variances between primary and relapse tumors
Single-cell sequencing techniques allow to understand the tumor cell composition and available single-cell profiles can serve as a reference control for analysis of bulk transcriptome profiles with deconvolution. To integrate this technique, we used as the reference a recently published single-cell profile dataset from MB [20] with detailed annotation of cell composition within them. In particular, the MB tumor cell types were distinguished into three main groups with each of them split into two subtypes: (A1; A2) cell cycle activity enriched, (B1; B2) undifferentiated progenitors and (C1; C2) differentiated neuronal-like cells. Application of the CIBERSORTx deconvolution software tool (see “Methods” section) to bulk primary and relapse transcriptome profiles demonstrated the enrichment of each cell group and subtype in a tumor at diagnosis and relapse respectively (Fig. 5a). We then analyzed group-specific differences in these compositions between primary and relapse cases. Groups 3 and 4 MB were processed separately since the reference single-cell gene profiles were provided in this format [20]. SHH-MB did not show any evident differences in cell cycle subtypes between primary and relapse samples (Additional File 1: Fig. S6a). However, the proportion of the undifferentiating progenitors appeared to be clearly increased (Fig. 5b; T-test p-value: 0.01), while formed differentiated neuron-like population decreased (Fig. 5c; T-test p-value: 0.004) in SHH-MB relapses. Notably, this effect was more evident in adult SHH-MB relapses (Additional File 1: Fig. S6b; T-test p-values: 0.01 in adults, 0.17 in infants).
The increased proportion of activated progenitors in SHH-MB relapses was in concordance with the GO enrichment of N-Glycan biosynthesis and ribosomal biogenesis processes, that were identified in relapse-associated DEG sets (Additional File 2: Table S3). Moreover, the effect was also inspected using gene set variance analysis (GSVA) procedure: the enrichment patterns of the corresponding MB SHH cell types in primary-relapse bulk profiles fully reflected the deconvolution results (Additional File 1: Fig. S5c).
In MB Group 4 the deconvolution was showing approximately a 1.5 times increase in cell cycle activity proportion subtype in relapses (Fig. 5d; T-test p-value 0.05), as was also confirmed by GO analysis (Additional File 2: Table S5). Respectively, the proportion of differentiated neuronal-like cells in the relapses vise-a-verse decreased (Fig. 5d; T-test p-value: 0.01), while undifferentiating progenitors did not demonstrate any evident changes between primary and relapse tumors (Additional File 1: Fig. S6d; T-test p-value: 0.34). In Group 3 MB the increase of cell cycle and loss of differentiated neuronal cell proportions were reflecting patterns similar to Group 4 MB (Additional File 1: Fig. S6e–g), however, the statistical evidence of the difference between primary and relapse proportion results did not pass filtering limits (p-value < 0.05), most likely due to low amount of input sample pairs (n = 5). Nevertheless, the results were also verified via GSVA applied for Group 4 cell type markers in primary-relapse bulk profiles (Additional File 1: Fig. S5h).