In this study, we examined the role of epigenome-wide DNA methylation profiles in DCI occurrence and recovery outcomes among aSAH patients. None of the CpG sites passed the EWAS significance threshold of 2.4 × 10−7. The top hit, cg18031596, had a p-value of 2.3 × 10−6 and was annotated to ANGPT1, a key player in angiogenesis after vascular injury. Despite the small cohort, we estimated 94.6% power at cg18031596 to detect a true mean difference as large as or larger than the observed mean difference at a significance threshold of p = 1 × 10−7. So as not to miss a potentially true signal with biological relevance, we conducted targeted bisulfite sequencing of ANGPT1 in a larger follow-up patient cohort. Although four of the five CpG sites sequenced were significantly associated at a Bonferroni-adjusted threshold, their effects were in the opposite direction to the EWAS results. These mixed results indicate that more research is needed to determine the relevance of ANGPT1 for the occurrence of DCI.
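As a rough illustration of how a post hoc power figure of this kind can be approximated, the sketch below uses a normal approximation to a two-sided two-sample test. It is not necessarily the method used for the 94.6% estimate above; the function name `two_sided_power` and all numeric inputs are hypothetical and are not values from this study.

```python
from statistics import NormalDist

def two_sided_power(mean_diff, sd_case, sd_control, n_case, n_control, alpha=1e-7):
    """Approximate power of a two-sided two-sample z-test (normal approximation)."""
    # Standard error of the difference in group means
    se = (sd_case ** 2 / n_case + sd_control ** 2 / n_control) ** 0.5
    z = NormalDist()
    # Critical value for a two-sided test at level alpha
    z_crit = -z.inv_cdf(alpha / 2)
    # Standardized true difference
    shift = abs(mean_diff) / se
    # Probability that the test statistic falls in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical inputs for illustration only (NOT taken from the study)
print(two_sided_power(mean_diff=0.8, sd_case=0.4, sd_control=0.4,
                      n_case=20, n_control=20))
```

Because the true difference is set equal to the observed one, a calculation of this kind tends to be optimistic when the observed effect is itself inflated by chance.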
The top EWAS CpG site, cg18031596, is located immediately upstream of the transcription start site of ANGPT1 (Fig. 2C). ANGPT1 encodes angiopoietin-1 (ANG-1), which belongs to the angiopoietin family and plays important roles in vascular development and angiogenesis. Studies have shown that, by acting through Tie2, an endothelial-specific tyrosine kinase receptor, ANG-1 mediates interactions between the endothelium and the surrounding matrix, maintains the integrity of the vascular endothelium, and inhibits endothelial permeability to protect blood vessels from leaking in vivo [19,20,21]. ANG-1 is also known as an anti-inflammatory factor, and inflammation is one of the proposed processes involved in the development of DCI [8, 9, 22]. Mice overexpressing Ang-1 displayed greater resistance to leaks caused by inflammatory agents and a reduced ischemic lesion volume after embolic middle cerebral artery occlusion [23, 24]. Several preclinical studies have shown therapeutic effects of ANG-1 in alleviating the consequences of ischemia and stroke [25,26,27]. In a previous longitudinal study, decreased serum levels of ANG-1 early after admission for aSAH were found in patients who later suffered DCI, and these patients experienced a delayed increase of ANG-1 compared with DCI controls [28]. Polymorphisms in the mouse ortholog of ANGPT1 were reported as genome-wide significant hits in a mouse study using a cerebral artery occlusion model, and human population studies have reported a few genetic variants of ANGPT1 associated with ischemic and hemorrhagic stroke [29, 30]. Because the relationship between DNA methylation and gene expression is tissue-specific and dynamic, it is not yet known whether the methylation status of cg18031596 regulates the expression level of ANGPT1 in DCI occurrence. By querying a public dataset containing methylation and mRNA data, we found a correlation between cg18031596 methylation and ANGPT1 expression in several tissues (TCGA Wanderer dataset, http://maplab.imppc.org/wanderer/) (Fig. S6). Future studies are needed to establish the regulatory effect of methylation at cg18031596 on the expression of ANGPT1.
We note that we were not able to rule out potential confounding by factors known to influence methylation levels, such as existing cerebrovascular and cardiovascular conditions, certain genetic variants, and environmental stimuli [31, 32]. The lack of genotype data for the studied patients prevented us from identifying and controlling for methylation quantitative trait loci (meQTL) as common causes of DCI and methylation. Because current knowledge of DCI etiology is far from complete, factors that have not yet been studied or recognized in the context of DCI may also confound our results, in addition to known DCI risk factors. Replication in larger independent cohorts with more complete data is needed to follow up our findings.
Although the post hoc power calculation for the top EWAS hit cg18031596 indicated an adequately powered discovery analysis, the failure to replicate this signal suggests that ANGPT1 likely ranked top in the EWAS by chance and that the association did not represent a true signal. Beyond this explanation, we performed several examinations to explore other possible reasons for the discordance in effect directions. Typical reasons for a flip in association direction include differences in methylation correlation patterns across populations, a lack of power in the smaller study, sampling variation, and differences in genomic and/or environmental context. Our discovery and replication samples were largely of the same ethnic origins (Table 1), so we did not expect them to differ much in methylation correlation patterns. The other possibilities are discussed in the following paragraphs.
First, this inconsistency may be attributed to a complex interplay among disease-influencing factors, coupled with a differential distribution of these factors in the discovery and replication samples. We investigated this possibility by examining the available demographic and clinical variables of the participants. We did not observe significant differences in the distribution of demographics (age, sex, race, height, weight) or clinical characteristics (Fisher score, recovery outcomes) between the discovery and replication groups (Table 1). We noticed remarkably high smoking rates among the participants in both groups, in line with smoking being a risk factor for aSAH. Although the difference was not statistically significant, the replication group had more smokers than the discovery group (Table 1, 68% vs 57%, Fisher's exact test p = 0.14). Smoking is a well-established trigger of methylation signature alteration [33], and studies have reported altered ANGPT1 methylation or ANG-1 protein levels in samples from smokers compared with non-smokers [34, 35]. We explored whether smoking complicated the analysis and contributed to the flip in effect direction, and observed a complex pattern: some CpGs showed a flip in direction only in smokers, whereas others flipped only in non-smokers (Fig. S5). However, given the small group sizes, we could not draw any conclusion. Although we used surrogate variables in the EWAS to control for unwanted variation not explicitly adjusted for as a covariate, such as smoking status, it is possible that smoking status complicated our results in a way that was not easy to unmask through the inspections we could perform with the data available. We found that the discrepancy in effect direction could not be explained by race, time of DNA collection, or Fisher score. It is possible that complex interactions among measured or unmeasured factors are in part responsible for the reversed directions, even though we were not able to detect such an effect statistically.
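The comparison of smoking rates between the two groups relies on Fisher's exact test for a 2×2 table. The following is a minimal sketch of the two-sided version of that test; the function name `fisher_exact_two_sided` is illustrative, and the counts in the usage line are hypothetical because only the percentages (68% vs 57%), not the underlying counts, are stated here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts (smokers, non-smokers) per group, for illustration only
print(fisher_exact_two_sided(34, 16, 23, 17))
```

A non-significant result from such a test does not rule out a difference between groups; with small groups the test has limited power.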
One major difference between the discovery and replication analyses was the technology used for methylation profiling. Although ANGPT1 methylation levels quantified by the 450K array and by MethylSeq showed medium to high correlation (ρ = 0.64 to 0.86, Fig. 2), the two measurements were not fully consistent, and the M values were far from identical. This difference in methylation measurements may have contributed to the flip in effect direction.
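The distinction between correlation and agreement can be made concrete with a small sketch. It converts beta values to M values using the standard logit transform, M = log2(β / (1 − β)), and computes a Spearman rank correlation. The beta values below are invented for illustration and are not measurements from this study; the helper names are likewise hypothetical.

```python
import math

def beta_to_m(beta):
    """Convert a methylation beta value (0 < beta < 1) to an M value."""
    return math.log2(beta / (1 - beta))

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Invented beta values for the same five samples on two platforms
array_beta = [0.10, 0.20, 0.30, 0.40, 0.50]
seq_beta = [0.25, 0.30, 0.45, 0.50, 0.70]
array_m = [beta_to_m(b) for b in array_beta]
seq_m = [beta_to_m(b) for b in seq_beta]
rho = spearman(array_m, seq_m)
mean_offset = sum(s - a for a, s in zip(array_m, seq_m)) / len(array_m)
print(rho, mean_offset)
```

In this toy example the samples are ranked identically on both platforms, yet every sequencing M value is shifted upward, so a high ρ alone does not guarantee that group differences are preserved across platforms.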
We then speculated that adjusting for surrogate variables in the discovery EWAS, while being unable to do so in the replication cohort, may have given rise to effect estimates of opposite directions. The results of the proxy surrogate variable analyses suggest that this could account for the sign flip at four of the five ANGPT1 CpGs but not at cg18031596 (Table S5).
Lastly, we explored whether the discrepancy in effect direction at cg18031596 could be explained by chance, provided that the association is not a false positive. Considering the entire sample, if the discovery EWAS by chance oversampled from the upper tail of the methylation distribution of DCI cases and the lower tail of the methylation distribution of DCI controls, then the samples remaining for the replication analysis would give the impression of an effect in the opposite direction. This explanation was supported by our simulation experiments, in which the chance of such oversampling was 8 in 1229. Sampling variation like this is unlikely but not impossible, and once it has happened, an opposite effect estimate is highly likely to follow (100% of the time in our simulation). With a sample size of only a few dozen, large statistical fluctuations also become more likely. It is also possible that the observed opposite effects resulted from a combination of the explanations discussed above.
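The oversampling argument can be sketched with a simple resampling experiment. This is not the simulation reported above, whose design is not detailed in this section; the pool sizes, split sizes, effect size, threshold, and the function name `split_flip_rate` are all assumptions chosen for illustration. A fixed pool of cases and controls is repeatedly split at random into discovery and replication subsets, and we count how often an extreme discovery difference is followed by an opposite-sign difference in the remainder.

```python
import random
from statistics import mean

def split_flip_rate(cases, controls, n_disc_case, n_disc_control,
                    threshold, n_sim=2000, seed=0):
    """Randomly split a fixed pool into discovery and replication subsets.

    Returns (n_extreme, n_flipped): the number of splits whose discovery
    case-control mean difference is at least `threshold`, and how many of
    those show a negative difference in the left-over replication subset.
    """
    rng = random.Random(seed)
    n_extreme = n_flipped = 0
    for _ in range(n_sim):
        ca = rng.sample(cases, len(cases))        # shuffled copy of cases
        co = rng.sample(controls, len(controls))  # shuffled copy of controls
        d_disc = mean(ca[:n_disc_case]) - mean(co[:n_disc_control])
        if d_disc >= threshold:
            n_extreme += 1
            d_rep = mean(ca[n_disc_case:]) - mean(co[n_disc_control:])
            if d_rep < 0:
                n_flipped += 1
    return n_extreme, n_flipped

# Hypothetical pool: standardized methylation values with a small true difference
pool_rng = random.Random(42)
cases = [pool_rng.gauss(0.1, 1.0) for _ in range(40)]
controls = [pool_rng.gauss(0.0, 1.0) for _ in range(80)]
n_extreme, n_flipped = split_flip_rate(cases, controls, 15, 30, threshold=0.8)
print(n_extreme, n_flipped)
```

Because the pool is fixed, drawing the high-valued cases and low-valued controls into the discovery subset necessarily depletes them from the remainder, which is why an extreme discovery estimate and an opposite-sign replication estimate tend to occur together.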
We found some evidence of associations between ANGPT1 methylation and patient recovery at 3 and 12 months post hemorrhage, although this did not replicate in the follow-up cohort (where we could not adjust for surrogate variables). This is consistent with a previous study demonstrating that a high serum concentration of ANG-1 predicts better 3-month post-hemorrhage GOS [36]. This relatively long-term effect of ANGPT1 methylation may be mediated through its effect on DCI risk, since patients with higher methylation levels tend not to suffer DCI and therefore recover better. The relationship may also be independent of, or only partially mediated by, DCI risk, which will need to be examined in future studies.
This study has several strengths. Cross-sectional EWAS is often criticized for the possibility of reverse causality. Although methylation can both influence and be influenced by health conditions, the longitudinal design of our study naturally established the proper temporality. The use of surrogate variables minimized confounding from technical artifacts and cell type heterogeneity. We also note several limitations that should be taken into consideration when interpreting the results. First, whether blood is a relevant tissue for studying brain conditions is still under debate. For a study of DCI, there is some evidence suggesting its usefulness: immune cell populations are likely to reflect biological changes ensuing from the inflammatory response, which has been shown to correlate with DCI events [8, 9]. Second, the targeted replication assay limited our ability to adjust for cell type heterogeneity in the replication analysis, which, as discussed above, may have contributed to the inconsistent effects of ANGPT1 methylation between the discovery and replication analyses. Third, although we excluded any samples collected after the first 48 h post hemorrhage, there may still be some heterogeneity among samples within the 48-h period. Fourth, sample size is a recurrent challenge in studies of diseases with low incidence and high mortality, and we were not able to find external replication cohorts with DNA methylation data. One prominent consequence of the small sample size is that the statistical model may not sustain adjustment for all necessary covariates, which prevented us from adjusting for factors thought to be clinically important for DCI, such as Fisher scale and smoking status. Nonetheless, we applied surrogate variable analysis, which is designed to remove the effects of uncontrolled variables. Lastly, some patients were lost to follow-up at 3 months and 12 months post injury, which may have affected our analysis of recovery outcomes at those time points if the drop-out was not random.