Study sample
This investigation was done within the CoDuBu study, which aimed to understand the co-occurrence of common infections and NCDs in the Taabo health and demographic surveillance system (HDSS) in south-central Côte d'Ivoire [37]. The detailed CoDuBu protocol is published elsewhere [38]. Concisely, the study began in 2017 and included 1019 adults selected at random, from three purposively selected HDSS areas of varying urbanization. Participants underwent detailed health examinations including anthropometry (body height, weight and waist circumference) and subclinical cardio-metabolic phenotyping, including blood pressure, fasting glucose level and blood lipid profile. Dried blood spots were prepared on Whatman FTA cards, air-dried at room temperature, sealed in desiccant containing pouches, according to manufacturer’s instructions, and stored at − 80° C in a biobank. Participants underwent interviews assessing their sociodemographic, behavioural, lifestyle and environmental characteristics, as well as health status and healthcare use, among others. The CoDuBu study was approved by the Côte d’Ivoire National Ethics Committee for Life and Health Sciences (ref. no. 032/IMSHP/CNER-kp; date of approval: March 24, 2017) and the Ethics Committee of North-West and Central Switzerland (ref. no. 2016-00143; date of approval: May 2, 2016). All participants provided written informed consent before participation in the study [38].
Measurement of DNAm
DNA was extracted from ~ 280 mm2 (10 × 6 mm punch) blood spot per subject for 400 FTA cards using standard salting out procedure based method [39]. The EZ 96-DNA methylation kit (Zymo Research; Irvine, CA, USA) was used for bisulfite-conversion of DNA. Epigenome-wide DNA methylation was measured using the Illumina Infinium MethylationEPIC BeadChip (Illumina, Inc.; San Diego, CA, USA) that covered 866,091 probes. Samples were randomly distributed on arrays to minimize batch effects. Each batch had multiple identical control samples to assess assay variability. Dye-bias correction [40] and absolute methylation level (β values, defined as the ratio of methylation intensity over total intensity, with offset of 100, were computed using the minfi R-package (R Development Core Team) [41]. Quality control excluded 24,138 probes (detection p value > 10−16) and seven samples (call rate < 95% [n = 5] and sex mismatch [n = 2]). We applied beta mixture quantile normalization of the β values to correct for the Illumina probe design bias [42]. We included 841,953 CpG sites from 393 participants in subsequent determination of epigenetic age.
Estimation of epigenetic age
We estimated the parameters of participants’ epigenetic age (and corresponding measures of age acceleration) from the new DNAm age calculator [9] where the participants’ DNAm beta values and annotation files were used as input files. HannumAge, HorvathAge, PhenoAge and GrimAge were estimated using the 71, 353, 513 and 1030 CpGs reported in their discovery studies, respectively [9,10,11,12]. Leukocyte proportions (B cells, CD4+ T, CD8+ T, granulocytes, monocytes and natural killer cells) were estimated using the Houseman’s method implemented in the online calculator [41, 43].
EAA was calculated for each clock as the residuals of linear regression of epigenetic age on chronological age [9, 16]. Unlike absolute difference measures, these residuals are robust to measurement platforms and normalization methods, with improved comparability across studies [44]. Specifically, IEAA is derived from regressing HorvathAge on chronological age and leukocyte composition, hence its intrinsic property. EEAA is derived from regressing a weighted average of HannumAge and age-varying leukocyte components (naïve cytotoxic T cells, exhausted cytotoxic T cells and plasmablasts) on chronological age [45]. This weighted average measure (and therefore EEAA) captures both age-related changes in leukocyte composition and intrinsic aging. PhenoAA and GrimAA are derived from regressing PhenoAge and GrimAge on chronological age, respectively [11, 12]. Given that GrimAge comprises eight DNAm surrogates of plasma proteins (adrenomedullin, B2M, GDF-15, Cystatin C, leptin, PAI-1 and TIMP-1) and pack years, we derived their chronological age-adjusted equivalents for further investigation into GrimAA. The epigenetic clocks of our sample were based on the EPIC array, which does not cover six CpGs of HannumAge and 19 CpGs of HorvathAge. Nevertheless, recent comparative studies showed EPIC-based clocks to exhibit comparable performance to their discovery arrays [44, 46].
Measurement of risk factors
We used a questionnaire to determine participants’ chronological age (years), sex (male or female), formal educational attainment (none, primary, secondary or tertiary), and residence (Amani-Ménou, Taabo-Cité or Tokohiri). Taabo-Cité is an urban area relative to Amani-Menou and Tokohiri, which are more rural. Wealth or asset index was estimated for each participant—in the context of the HDSS—by applying principal component analysis to property or possessions, and housing characteristics of the participants’ households [37]. Wealth index is a reliable and stable proxy for consumption and therefore economic status in general terms [47, 48].
Lifestyle factors included smoking status, alcohol consumption and physical activity. Smoking status was determined as never-smoker (lifetime of nonsmoking), former smoker (smoked in the past but quit) and current smoker (presently smoked tobacco products). Alcohol consumption was determined using the AUDIT-C (Alcohol Use Disorder Identification Test-Consumption) questionnaire, which scores alcohol consumption and frequency (0–12) to identify risky consumption. Participants were classified as low risk (AUDIT-C score 0–3 for males and 0–2 for females), medium risk (4–9 for males and 3–9 for females), and high risk (10–12) [49]. Physical activity was measured as self-reported number of min per week of engagement in at least 10 min of moderate and vigorous activities covering transport, leisure and occupational activities [50]. Body weight (kg), height (cm) and waist circumference (cm) were measured to the nearest 0.1 unit. BMI was calculated as the ratio of body weight and height-squared (kg·m−2), and participants were classified into underweight (< 18.5), normal weight (18.5–24.9), overweight (25–29.9) and obese (≥ 30).
Measurement and definition of metabolic syndrome phenotypes
Central obesity was defined as waist circumference ≥ 94 cm in males and ≥ 80 cm in females. Blood pressure was measured three times, on the left arm, in a sitting position, and the mean of the last two measures noted. Raised blood pressure was defined as mean blood pressure ≥ 135/80 mmHg or use of blood pressure-lowering medication. Fasting glucose and lipid profile were measured using the point-of-care Alere AS100 system (and corresponding cartridges from same production batch), which exhibited good performance in tropical settings [36, 51, 52]. Impaired fasting glucose was defined as having fasting glucose ≥ 5.6 or use of glucose-lowering medication. Low HDL was defined as < 1.0 mmol·L−1 and < 1.3 mmol·L−1 in males and females, respectively, whereas high triglycerides were defined as ≥ 1.7 mmol·L−1. We defined MetS severity as an additive score of five components where a score of zero or five indicates absence or presence of all five components, respectively. We also defined presence and absence of MetS as scoring ≥ 3 and < 3 on the MetS severity scale [53].
Statistical analysis
Descriptive statistics
Statistical analyses were performed in Stata version 16 (Stata Corporation; College Station, TX, USA). We examined the EAA parameters for extreme outliers defined as observations greater than three interquartile range beyond the interquartile range of each parameter. We summarized the characteristics of included participants, using means and SD for continuous variables, and frequencies for categorical variables.
To evaluate the performance of the epigenetic clocks, we calculated the EAD (absolute difference between epigenetic and chronological age) and the Bland-Altman’s 95% limits of agreement [54]. We also tested Lin’s concordance correlations between epigenetic and chronological age. Lin’s concordance correlation coefficient combines the tightness of the observations to the line of best fit, and the nearness of the line of best fit to the identity line (where Y = X) of perfect concordance [55, 56]. We tested Pearson’s correlations between chronological age and (i) epigenetic age; (ii) EAD; and (iii) EAA. Finally, we tested the robustness of these performance indicators limited to apparently healthy participants (i.e., non-smokers with no MetS feature).
Associations between sociodemographic and lifestyle factors, and EAA
We performed hierarchical multivariable regressions to assess the association between sociodemographic and lifestyle factors and each of the four age acceleration parameters. Here, we estimated the minimally biased effect size for each block of covariates by excluding as much as possible, the potential mediators of the variables in the covariate block. As shown in Fig. 1, we estimated the effects of four cumulative blocks of covariates including:
-
i.
Model 1: chronological age and sex;
-
ii.
Model 2: model 1 + educational level, household wealth index and urbanization (socioeconomic);
-
iii.
Model 3: model 2 + smoking, alcohol consumption and physical activity (lifestyle) and
-
iv.
Model 4: model 3 + BMI (secondary lifestyle).
We report the effect estimate of a covariate as the effect in the model containing the block at the highest hierarchy. For instance, the effect estimate of sex, wealth, alcohol, and BMI will be derived from models 1, 2, 3 and 4, respectively.
We primarily modelled covariates in two categories: sex (male vs. female), educational level (secondary and tertiary vs. others), wealth index (richest third vs. lower two-thirds), urbanization (urban vs. rural), smoking (former and current smokers vs. never smokers), alcohol consumption (high risk vs. low risk), physical activity (lowest third vs. upper two-thirds) and BMI (≥ 25 vs. <25 kg m−2). We additionally tested polynomials of physical activity (model 3) and BMI (model 4) for potential non-linearity in relation to EAA. We performed collinearity tests for the fully adjusted EEA models and observed minimal evidence for multi-collinearity (variance inflation factor, VIF range 1.1–1.9).
We further performed subgroup analyses by age group (cut-off at median value; > 41 vs. ≤ 41 years), sex and urbanization as the major potential modifiers and tested between-group differences via multiplicative interaction terms between the EAA and each modifier in the model. In sensitivity analyses, we tested robustness of PhenoAA and GrimAA effect estimates to adjustment for leukocyte proportions, and performed partial Spearman correlations of the covariates with each DNAm surrogate component of GrimAA. Results of regression analyses are presented as changes in mean EAA and their 95% CIs in relation to each risk factor.
Associations between EAA and cardio-metabolic phenotypes
We regressed each of the cardiometabolic phenotypes—MetS, single components and severity score—on the standardized values of each EAA. We used binomial logistic regression for MetS and each component, and ordinal logistic regression for MetS severity. In these models, we had two levels of main covariate adjustments. First, we adjusted for chronological age. Second, we additionally adjusted for sex, socioeconomic and lifestyle factors corresponding to model 3 in the EAA outcome model. In sensitivity analyses, we additionally adjusted for BMI (all EAA) and leukocyte proportions (PhenoAA and GrimAA). We also regressed MetS outcomes on each DNAm surrogate of plasma protein components of GrimAA independently of demographic, socioeconomic and lifestyle factors, to identify specific associations with GrimAA components. We also performed collinearity tests for the fully adjusted MetS models and observed minimal evidence for multi-collinearity (VIF range 1.1–1.8).
We also tested effect modification by age group, sex and urbanization using the primary MetS logistic models, and tested between-group differences via multiplicative interaction terms between the EAA and each modifier in the model. Results of regression analyses are presented as ORs of MetS (or components) and 95% CIs in relation to each EAA measure.