Plasma acylcarnitines and amino acids in dyslipidemia: an integrated metabolomics and machine learning approach
Authors:
Ali Etemadi, Tehran University of Medical Sciences
Houra Mobaleghaleslam, Tehran University of Medical Sciences
Maryam Mirabolghasemi, University of Tehran
Mehdi Ahmadi, Tehran University of Medical Sciences (TUMS)
Hojat Dehghanbanadaki, Tehran University of Medical Sciences
Shaghayegh Hosseinkhani, Tehran University of Medical Sciences
Fatemeh Bandarian, Tehran University of Medical Sciences
Niloufar Najjar, Tehran University of Medical Sciences
Arezou Dilmaghani-Marand, Tehran University of Medical Sciences
Nekoo Panahi, Tehran University of Medical Sciences
Babak Negahdari, Tehran University of Medical Sciences (TUMS)
Mohammadali Mazloomi, Tehran University of Medical Sciences (TUMS)
Mohammad Hossein Karimi-jafari, University of Tehran
Farideh Razi, Tehran University of Medical Sciences
Bagher Larijani, Tehran University of Medical Sciences

Research Article
Keywords: Mass Spectrometry, Metabolomics, Triglycerides, Dyslipidemia, Machine learning
Posted Date: January 3rd, 2023
DOI: https://doi.org/10.21203/rs.3.rs-2400804/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Additional Declarations: No competing interests reported.
Version of Record: A version of this preprint was published at Journal of Diabetes & Metabolic Disorders on February 24th, 2024. See the published version at https://doi.org/10.1007/s40200-024-01384-9.
Abstract
Background: The discovery of underlying intermediates associated with the development of dyslipidemia leads to a better understanding of its pathophysiology, and modifying these intermediates may be a promising preventive and therapeutic strategy for the management of dyslipidemia.
Methods: The dataset came from a large cross-sectional study of 1200 subjects and was stratified into four binary classes (normal versus abnormal) based on levels of triglyceride (TG), total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), and non-HDL-C. We first evaluated plasma concentrations of 20 amino acids and 30 acylcarnitines in each class of dyslipidemia. These attributes, along with baseline characteristics, were then used to check whether machine learning (ML) algorithms could classify cases and controls.
Results: Although lipid levels fluctuate during the day, which introduces fluctuation into the data, our ML framework accurately predicted the TG binary classes. Moreover, alanine, phenylalanine, methionine, C3, C14:2, and C16 had great power in differentiating patients with high TG from controls with normal TG.
Conclusions: The comprehensive output of this work, along with sex-specific attributes, will improve our understanding of the underlying intermediates involved in dyslipidemia.

1. Introduction
Dyslipidemia can be described as a condition in which patients have unbalanced concentrations of one or more of the following: high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, total cholesterol (TC), and triglyceride (TG) (1, 2). The molecular basis of dyslipidemia lies in insulin resistance and hyperinsulinemia, which cause obesity, the most common metabolic disorder.
Insulin resistance pathways and dyslipidemia are both associated with adiposity, which is characterized by structural and functional changes in adipose tissue (3). It is well established that dyslipidemia can aggravate other diseases such as atherosclerosis and cardiovascular disease (CVD) (4). Dyslipidemia may also be linked to obesity, type 2 diabetes mellitus, and certain types of cancer (5, 6). Therefore, early screening and effective lipid management are essential for improving quality of life and reducing the economic burden. Metabolomics has provided new insights into precision medicine, such as the discovery of new intermediate markers and a better understanding of the underlying pathophysiology. As omics data accumulate, there is still a need to extract valuable knowledge from the various omics datasets. Machine learning enables researchers to recognize and extract patterns from this large amount of data and helps to identify the patterns that best explain the metabolomic alterations in dyslipidemia. Within this context, we report a data-driven platform that combines targeted LC-MS/MS metabolomics with a machine learning approach to evaluate fasting plasma amino acids and acylcarnitines in dyslipidemia. The present study proposes a new method, based on a metabolomics approach, that uses metabolites along with gender-specific attributes to better understand the underlying intermediates involved in dyslipidemia.

2. Methods
2-1-Data collection and experiments
In this study, 1200 participants were randomly selected from our previous study on the Surveillance of Risk Factors of Noncommunicable Diseases (NCDs) in 30 provinces of Iran (STEPs 2016 Country report in Iran) (1), which followed the WHO STEPwise approach to surveillance.
The ethics committee of the Endocrinology and Metabolism Research Institute, Tehran University of Medical Sciences (IR.TUMS.EMRI.REC. 1395.00141) approved the study protocol, and the study was performed in accordance with the Declaration of Helsinki. The purpose of the study was explained to the patients and written informed consent was obtained from all participants. Venous blood was collected after 12 h of fasting in tubes containing sodium fluoride and EDTA. Biochemical laboratory tests were performed using commercial Roche kits (Roche Diagnostics, Mannheim, Germany) and a Cobas C311 autoanalyzer. A portion of each plasma sample was set aside for metabolomic analyses.

Tandem mass spectrometry
A Thermo Scientific Dionex UltiMate 3000 standard HPLC system coupled to an API 3200 triple quadrupole mass spectrometer (SCIEX), operated in positive electrospray ionization mode, was used for flow-injection MS/MS analysis of fasting plasma samples. After injection of a 5 μL sample, a total of 50 metabolites, including 20 amino acids and 30 acylcarnitines, were analyzed. The mobile phase was 75% aqueous acetonitrile. Data processing and metabolite quantification were performed using Multiquant software (ABI Sciex). For calibration and calculation of analyte concentrations, the ratios of the metabolite signals to those of the isotopes used as internal standards were used. The full analytical protocol can be found in the relevant reference (7).

2-2-Data processing and analysis
Patients were classified into two groups (non-drug-receiving and drug-receiving) based on their history of receiving lipid-lowering medications during the study. Missing data and outliers in the non-drug-receiving group were excluded. The data for TG, HDL, TC, and non-HDL cholesterol were labeled into binary classes based on reference (1). A TC concentration ≥ 200 mg/dL (≥ 5.2 mmol/L) was used to define hypercholesterolemia (TC class). High non-HDL cholesterol was defined as non-HDL ≥ 130 mg/dL (≥ 3.4 mmol/L). Serum HDL levels
A total of 1200 patients were selected from our previous study (1). The baseline characteristics of enrolled patients are presented in Table 1. In this study, 1094 patients had no history of receiving lipid-lowering drugs and were regarded as the non-drug receiving group.

TG data analysis
A single-point TG cut-off of 150 mg/dL was used to divide the patients into two groups: normal (≤ 150 mg/dL) and abnormal (> 150 mg/dL). Before sampling, 762 patients had normal TG levels and 321 had abnormal TG levels. After sampling based on sex, age, and BMI, there were 235 and 205 samples in the normal and abnormal TG groups, respectively.

The feature selection for TG
The Mann–Whitney U-test (Figure 1A and Table 2S) showed that the following amino acids were significantly higher in the abnormal TG group than in the normal group: alanine (459.63±96.25, 401.38±97.10), glutamic acid (71.07±12.31, 67.88±14.00), leucine (137.63±27.46, 121.87±25.85), phenylalanine (65.56±10.95, 63.92±11.19), tyrosine (74.80±15.50, 71.19±14.33), valine (288.89±53.70, 253.20±48.57), proline (271.33±84.09, 251.04±82.55), lysine (189.42±48.39, 180.57±43.30), and tryptophan (74.27±16.63, 68.78±14.52). Glycine (256.41±80.06, 273.65±77.93), serine (97.77±26.87, 109.46±30.20), and asparagine (45.52±17.22, 49.51±20.01) were lower in the abnormal TG group than in the normal TG group. Among the acylcarnitines, C0 (59.40±12.82, 55.06±12.41), C3 (0.99±0.43, 0.83±0.34), C16 (0.20±0.06, 0.18±0.05), and C18:2OH (0.04±0.03, 0.03±0.02) were significantly higher in the abnormal TG group than in the normal group, whereas C4OH (0.05±0.02, 0.06±0.02), C8 (0.31±0.36, 0.37±0.43), C8:1 (0.30±0.17, 0.35±0.18), C10 (0.39±0.37, 0.47±0.46), C10:1 (0.36±0.33, 0.44±0.37), and C14:2 (0.09±0.04, 0.11±0.05) were significantly lower in the abnormal TG group than in the normal group.
The values in parentheses are the mean and standard deviation for the abnormal and normal groups, respectively, for each factor. Surprisingly, mean height in the TG normal group (161.60±10.63) was significantly lower than in the TG abnormal group (164.19±9.90). Alanine aminotransferase (ALT) in the normal group (19.51±8.69) was significantly lower than in the abnormal group (23.82±14.85). Both SelectKBest and RFECV were used to identify the top 10 features with the highest weights for classifying the TG groups. SelectKBest, a univariate feature selection method, identified alanine, leucine, tyrosine, valine, glycine, proline, serine, tryptophan, asparagine, and diastolic blood pressure as the most relevant features for TG group classification. The optimal features extracted using RFECV were C0, C14:2, alanine, leucine, valine, threonine, serine, tryptophan, asparagine, and weight. Pearson's correlation coefficients were used to measure the strength of the linear relationship between pairs of variables (Figure S1) and between each variable and TG (Figure 1B). As shown in Figure 1B, valine, leucine, and alanine had correlations with TG greater than 0.3. The inter-correlation of factors was also checked with Pearson's correlation coefficients (Figure S1), and scores greater than 0.6 were considered to indicate a strong correlation. C14:1 with C16 and C10:1, C14:2 with C14:1 and C10:1, C18 with C16, and valine with tyrosine and leucine were strongly correlated, with Pearson scores greater than 0.6. Furthermore, the point-biserial correlation (Figure S1) showed that alanine, leucine, valine, serine, tryptophan, and weight had the highest correlation (greater than 0.3) with the TG classes.

Machine learning for TG classification
A dataset of 440 instances (235 samples with normal TG and 205 samples with abnormal TG) was used for TG classification.
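The point-biserial correlation used above to relate each feature to the binary TG class can be illustrated with a short sketch. It uses only the Python standard library and is not the authors' code: the function names, the toy concentrations, and the use of the absolute value when applying the 0.3 threshold are all assumptions made for illustration. The point-biserial coefficient is numerically identical to Pearson's r computed between a continuous variable and a 0/1 label.

```python
import math


def point_biserial(values, labels):
    """Point-biserial correlation between a continuous variable and 0/1 labels."""
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    group1 = [v for v, y in zip(values, labels) if y == 1]
    group0 = [v for v, y in zip(values, labels) if y == 0]
    n1, n0, n = len(group1), len(group0), len(values)
    if n1 == 0 or n0 == 0:
        raise ValueError("both classes must be present")
    mean_all = sum(values) / n
    # Population standard deviation of the continuous variable.
    sd = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("the variable is constant")
    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    return (mean1 - mean0) / sd * math.sqrt(n1 * n0 / (n * n))


def features_above_threshold(features, labels, threshold=0.3):
    """Keep features whose |r_pb| with the class label exceeds the threshold."""
    scores = {name: point_biserial(vals, labels) for name, vals in features.items()}
    return {name: r for name, r in scores.items() if abs(r) > threshold}


# Hypothetical toy data: 0 = normal TG, 1 = abnormal TG.
tg_class = [0, 0, 0, 1, 1, 1]
toy_features = {
    "alanine": [380.0, 400.0, 410.0, 450.0, 470.0, 440.0],
    "unrelated": [5.0, 7.0, 6.0, 6.0, 5.0, 7.0],
}
selected = features_above_threshold(toy_features, tg_class)
print(sorted(selected))
```

In a real analysis the same quantity is available as `scipy.stats.pointbiserialr`, which also returns a p-value.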
Data in the normal and abnormal groups were adjusted by age (mean 53.76 and 53.93 years, respectively), sex (frequency 220 and 220, respectively), and BMI (mean 28.6 and 28.5, respectively), and the differences between the groups were not statistically significant (p-value > 0.05). Based on the feature selection methods, different combinations of feature sets were tested, and the highest accuracy was achieved when all statistically significant features were used. This feature set for TG classification had 25 independent attributes and one target (dependent) feature (Figure S1). The target was labeled 0 for a person with a normal TG value and 1 for a person with an abnormal TG value. For TG classification, 21 ML models were evaluated, and the top five models based on ROC curves were Logistic Regression (LR), Support Vector Classification (SVC), Linear Support Vector Classifier (LSVC), Random Forest (RF), and Linear Discriminant Analysis (LDA). A comparison of the top models based on the ROC curve with all 25 independent attributes is shown in Figure 2A. These five models showed satisfactory TG classification performance, with AUCs between 0.76 and 0.81. The SVM model (AUC = 0.81, standard deviation of test accuracy = 0.04) performed slightly better than the other models and was considered the optimal model for TG classification. For this model, the mean CV score (K=10), recall (true positive/(true positive + false negative)), precision (true positive/(true positive + false positive)), F1, and standard deviation of the test accuracy were 0.69, 0.7, 0.72, 0.71, and 0.04, respectively. In terms of precision, LDA performed best, with a precision of 0.73 (standard deviation of test accuracy = 0.05).
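The recall, precision, and F1 definitions quoted above can be written out as a minimal sketch. The confusion-matrix counts below are hypothetical and the function names are ours; only the final check uses numbers from the text, showing that the F1 reported for the SVM model is consistent with its reported precision (0.72) and recall (0.7).

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)  # true positive / (true positive + false positive)
    recall = tp / (tp + fn)     # true positive / (true positive + false negative)
    return precision, recall, f1_from(precision, recall)


# Hypothetical counts, not taken from the study.
precision, recall, f1 = precision_recall_f1(tp=6, fp=2, fn=4)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")

# Consistency check on the values reported for the SVM model.
print(f"F1 from reported precision and recall: {f1_from(0.72, 0.70):.2f}")
```

With scikit-learn available, `sklearn.metrics.precision_recall_fscore_support` returns the same quantities directly from label vectors.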
However, LSVC had the highest recall (recall = 0.79, standard deviation of test accuracy = 0.07). In addition, LSVC retained strong predictive performance in terms of F1, with a score of 0.75. Table S3 summarizes all characteristics of the top five models.

Feature importance for top 5 models in TG class
The ELI5 library (15) (Accessed:2022-06-25) was used to extract feature importance for each model. Figure 2B and 2C show the top 10 most important features of the SVM and LR used for prediction, respectively. Six features were common to both SVM and LR feature importance (Figure 2D): alanine, phenylalanine, methionine, C16, C14:2, and C3. For both SVM and LR (Figures 2B and 2C), we also assessed the models based on a 2×2 confusion matrix with true TG labels on one axis and predicted TG results on the other axis. The matrix showed that both the SVM and LR performed better in predicting abnormal TG classes. Predicted abnormal TG values (true positive) in the matrix in SVM and LR were 0.8 and 0.79, respectively. For normal TG (true negative), these scores were 0.72 and 0.73, respectively. Data for TC, HDL, and non-HDL cholesterol Data processing methods for the TC, HDL, and non-HDL cholesterol groups were the same as those used for the TG group. After excluding missing data points and outliers from each class, matching (sampling) of cases and controls in the groups was performed based on age, sex, and BMI. The number of controls in the TC, HDL, and non-HDL cholesterol groups after sampling was 187, 271, and 302, respectively. Furthermore, after sampling, in the abnormal group there were 162, 245, and 272 cases in TC, HDL, and non-HDL cholesterol groups, respectively. Three feature selection methods (statistical analysis using the Mann–Whitney U-test, Selectkbest with Chi-Square, and RFECV) were used to extract a combination of important features (Table S4). Table S4 summarizes all extracted features using the three mentioned methods for the TG, TC, HDL, and non- HDL cholesterol groups. The Pearson correlation coefficients of all HDL features are shown in Figure S2. The ML data for HDL classification using statistically significant attributes (Figure S3A) showed that the top prediction model was a Random Forest with an AUC score of 0.73. 
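The per-class rates quoted for the TG confusion matrices (for example 0.8 for abnormal and 0.72 for normal TG with the SVM) appear to be row-normalized fractions, that is, counts divided by the number of samples in each true class. The sketch below shows that computation under this assumption; the labels are hypothetical and the helper names are ours.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Count a 2x2 matrix: rows are true labels (0, 1), columns are predictions."""
    matrix = [[0, 0], [0, 0]]
    for truth, pred in zip(y_true, y_pred):
        matrix[truth][pred] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its total so entries are fractions of the true class."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


# Hypothetical labels: 0 = normal TG, 1 = abnormal TG.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0]

counts = confusion_matrix_2x2(y_true, y_pred)
rates = normalize_rows(counts)
print(counts)
print(rates)
```

In the normalized matrix, the bottom-right entry is the fraction of abnormal cases predicted as abnormal and the top-left entry is the fraction of normal cases predicted as normal, which is how the two rates per model are read in the text.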
The prediction power of the model was slightly stronger for abnormal patients, as depicted in the confusion matrix in Figure S3B. The top features for HDL classification using RF (asparagine, C3, C0, glutamic acid, alanine, and C5) are plotted in Figure S3C. For non-HDL cholesterol classification, the most satisfactory prediction accuracy was obtained with the SVM model, with an AUC of 0.72, as shown in Figure S4A. The top features according to ELI5 feature importance were C16, valine, and C18:1 (Figure S4B and S6C). The normalized confusion matrix showed that the model was stronger in the normal non-HDL-C group (Figure S4B). The Pearson correlation coefficients of all features for HDL and TC levels are shown in Figures S5 and S6, respectively. Machine learning models did not appear able to classify cases and controls in TC: the most accurate model was RF, with an AUC of 0.61 (Figure S7A). The top feature according to ELI5 was tryptophan (Figure S7B and S9C). Point-biserial correlations also showed that alanine and C18 had the highest correlation with the TC classes (Figure S8).

Association between gender and dyslipidemia
In this study, we stratified the association between all attributes and dyslipidemia by sex to determine which factors were more prominent in males or in females. These gender-specific attributes may help in screening programs. Figure 3 shows the p-values for sex-specific attributes in each class of dyslipidemia. In females (Figure 3A), patients with abnormal TG had lower levels of serine (Figure 3B), asparagine, threonine, glycine, citrulline, C14:2, C14:1, C10, C8:1, C8, and C5:DC and higher levels of proline, glutamic acid, and C18:2OH (as well as greater height) than the normal TG group, whereas these metabolites showed no differences between the abnormal and normal TG groups in males.
In the male population, patients with abnormal TG levels had lower levels of C3 and higher levels of lysine, histidine, C18, C5:OH, C5, and C3(Figure 3B) ( as well as ALT) than the normal TG group, whereas there were no differences in the female population between the abnormal TG and normal TG groups. Furthermore, we found that both males and females with abnormal TG levels had lower levels of alanine (Figure 3B) and tryptophan and higher levels of C0, C16, C10:1, leucine, and valine than the normal TG group. For women in the TC group, the waist-hip ratio and aspartic acid level were significantly higher in the abnormal and normal groups, respectively. In contrast, males in the normal TC group had significantly lower tryptophan, proline, ornithine, and alanine levels than males with abnormal TC levels (Figure 3C). The data suggested that females with normal HDL had statistically lower ALT, waist-hip, and higher alanine, citrulline, arginine, and C14OH levels than the abnormal group (Figure 3D). Furthermore, higher asparagine, serine, threonine, and glycine levels and lower mean diastolic pressure were observed in females with normal non-HDL. In this group, males with abnormal non-HDL showed significantly higher levels of tryptophan, ornithine, tyrosine, leucine, alanine, C18, C16, C5OH, C5, and C0 than males with normal non-HDL (Figure 3E). To check for differences in attributes based on gender in normal and abnormal groups, data were stratified by outcomes(normal/abnormal), and p-values were calculated for males and females in both normal and abnormal TG, TC, non-HDL, and HDL groups (Figure S9A). For simplicity, the data for the TG groups are reported here, and the full details are available in Figure S9A-E. In the normal TG group, the concentrations of asparagine, serine, glycine, and C18:1 were significantly different between men and women. 
Furthermore, WC, histidine, tyrosine, aspartic acid, C14:2, C14, C12, and C0 levels differed significantly between men and women in the abnormal TG group (Figure S9B). In both the normal and abnormal TG groups, ALT, waist-hip ratio, HC, BMI, height, tryptophan, proline, citrulline, valine, phenylalanine, methionine, leucine, glutamic acid, C18, C16:1, C5DC, C5:OH, C5, C4DC, C3, and C3DC differed significantly between males and females.
Pathway and Metabolite enrichment analysis
According to the metabolite enrichment analysis, 33 pathways were enriched (Table S1). Those with an adjusted p-value < 0.05 included aminoacyl-tRNA biosynthesis; valine, leucine, and isoleucine biosynthesis; alanine, aspartate, and glutamate metabolism; arginine biosynthesis; glyoxylate and dicarboxylate metabolism; glycine, serine, and threonine metabolism; histidine metabolism; phenylalanine, tyrosine, and tryptophan biosynthesis; pantothenate and CoA biosynthesis; D-glutamine and D-glutamate metabolism; nitrogen metabolism; glutathione metabolism; phenylalanine metabolism; and cysteine and methionine metabolism. These pathways are illustrated in Figure 4. Network-based analysis was used to determine the relationships between the metabolites listed in Table S1 and the enzymes and reactions involved in their metabolism. The network contains 185 nodes and 315 edges. To identify the nodes with the greatest impact on the network, the cytoHubba application was applied. In terms of betweenness, the most critical nodes (Figure 5) were: in the pathway cluster, protein digestion and absorption (hsa04974), ABC transporters (hsa02010), mineral absorption (hsa04978), and pancreatic secretion (hsa04972); in the enzyme class, Na+/K+-exchanging ATPase (7.2.2.13) and triacylglycerol lipase (3.1.1.3); and in the metabolite group, L-Phenylalanine (C00079), L-Serine (C00065), L-Aspartic acid (C00049), L-Leucine (C00123), L-Glutamic acid (C00025), L-Valine (C00183), L-Alanine (C00041), Glycine (C00037), L-Glutamine (C00064), L-Tryptophan (C00078), L-Proline (C00148), and Triacylglycerol (C00422).

Analysis of drug-receiving groups
The drug-receiving group (patients with a history of receiving lipid-lowering drugs) comprised 106 patients.
In this group, the concentrations of C18, TG, TC, non-HDL cholesterol, and HDL differed significantly between the two LDL classes defined by an LDL cut-off of 100 mg/dL (Figure S10). In addition, between the TC classes defined by a cut-off of 200 mg/dL, the Mann–Whitney (independent samples) test showed significant differences in C4DC, C18, serine, asparagine, TG, non-HDL cholesterol, and HDL (Figure S11).

4. Discussion
Our previous study showed that serum lipid levels in the adult Iranian population were at critically dangerous levels, with eight out of ten people having undesirable serum lipid levels (1). The prevalence of low HDL, high non-HDL cholesterol, hypertriglyceridemia, and hypercholesterolemia in the adult Iranian population was 60%, 39.5%, 28.0%, and 26.7%, respectively. The systematic analysis and study of metabolites in biological samples is part of metabolomics. Using metabolites such as amino acids and acylcarnitines, metabolomics can extract useful knowledge from normal and abnormal samples, which eventually helps us explore the pathophysiological conditions and molecular mechanisms of some diseases and disorders (16, 17). To the best of our knowledge, this is the first study to systematically investigate amino acids and acylcarnitines as risk markers for dyslipidemia using LC-MS/MS. These factors have not been extensively studied in the field of precision medicine, and there is insufficient evidence regarding machine learning applications for dyslipidemia classification. A novel and significant contribution of this study is a way to solve the two-class classification problem (normal or abnormal) based on the plasma concentrations of amino acids and acylcarnitines. The main constituents of the human lipid fraction are cholesterol, TGs, and high-density lipoproteins.
Studies showed that the fluctuation in lipid and lipoprotein levels daily and even hourly are often encountered in hyperlipidemic patients (18). For example, in response to meals, TGs change dramatically, becoming 5–10 times higher than fasting levels just a few hours after eating. Although our sampling methods used fasting plasma to exclude fluctuations, it seems that even fasting levels vary considerably from day to day, and these modest changes in fasting TG levels might cause huge problems in machine learning algorithms. We attempted to exclude outlier data points from our data frame to reduce potential data sparsity and noise. ML prediction showed that by using amino acids and acylcarnitines as attributes, only TG classification had satisfactory accuracy. In this regard, a study conducted by Yousri et al. on the relationship between metabolite levels and dyslipidemia reported that TG was the most significantly perturbed lipid pathway (19). The goal of this study was to classify the TG group based on a single-point TG cut-off of 150 mg/dL. Alanine, glutamic acid, leucine, phenylalanine, tyrosine, valine, proline, lysine, and tryptophan were significantly increased in the abnormal TG group compared to the normal group, whereas glycine, serine, and asparagine showed a decreasing trend. After adjusting the data based on sex, age, and BMI, the findings showed that there were several important features that had the highest classification weights and differences between the normal and abnormal groups. Alanine, Leucine, and Valine showed the highest differences between the normal and abnormal TG groups based on both the MW-U test and Pearson’s correlation coefficient. The concentrations of these three features were highly increased in the abnormal TG groups compared to those in the normal group, which is in accordance with the study of Yousri and coworkers (19). 
In another study, in both sexes, valine and leucine levels correlated positively with TG levels and negatively with HDL cholesterol levels (20). Rose et al. showed that fasting serum TG levels were significantly higher after orally administered L-alanine (21). They suggested that this change in serum TG levels may have been due to increased metabolism of alanine to pyruvate and its incorporation into lipids under insulin stimulation. Wiklund et al. examined the association between TG concentrations and serum amino acid profiles during pubertal growth to predict hypertriglyceridemia in early adulthood (22). Although the study was limited to girls, they found that serum leucine and isoleucine levels correlated significantly with future TG levels. As a possible mechanism for the observed elevation, in obesity a decline in the catabolism of valine and leucine in adipose tissue can lead to an increase in their circulating levels. In addition, readily usable lipid and glucose substrates can avoid the requirement of amino acids for metabolism in
adipose tissue (23). In this study, 347 participants had BMI > 25. With regard to acylcarnitines, C0, C3, C16, and C18:2OH increased in the abnormal group. Furthermore, C4OH, C8, C8:1, C10, C10:1, and C14:2 levels were significantly lower in the abnormal TG group than in the normal group. C3 acylcarnitine, levels of which are increased in abnormal TG groups compared to normal ones, is a byproduct of valine and isoleucine amino acids(24). A significantly increased alanine aminotransferase (ALT) level, which is a common laboratory marker for underlying chronic liver disease (25) was also reported in the abnormal TG group. These findings are in line with those of Chen et al.'s (26) results. They proposed that serum ALT levels were independently correlated with the hepatic TG content in obese subjects. They also mentioned that ALT level might be more appropriate as a predictor for the degree of non- alcoholic fatty liver disease (NAFLD) than aspartate aminotransferase (AST) and gamma-glutamyltransferase (GGT). Although the direct determination of dyslipidemia factors in the laboratory is the most accurate and preferred method, when this is not available, machine learning can help with less computationally expensive methods and shorter time frames. Here, we showed that in both SVC and LR models, TG values can be accurately classified into normal and abnormal classes based on plasma concentrations of amino acids and acylcarnitine. Both models were applied to the statistically significant features. Alanine, Phenylalanine, Methionine, C3, C14:2, and C16 all had a statistically significant effect on TG classification according to the ELI5 library for extracting feature importance from the SVC and LR models. Our findings showed that in the drug-receiving groups, the concentration of acylcarnitine C18 was statistically significant in groups stratified by both LDL and cholesterol. 
In both groups of people with abnormal LDL and cholesterol levels, C18 concentrations were higher than those in the normal groups. C18 is a long chain acylcarnitine. Elevated levels of C18 have been shown in people with carnitine-palmitoyl transferase-2 deficiency disorder, which is the most common inherited disorder of lipid metabolism in adults(27). Similar to C18, C4DC concentrations were significantly higher in the patients with abnormal HDL levels. Serine and asparagine showed lower concentrations in the abnormal cholesterol groups than in the drug-receiving groups. Despite the fact that there are not enough published studies with adequate data stratified by sex, there is strong evidence that sex, as an endogenous factor, influences metabolism, incidence or severity of diseases, and therapy (16, 28). In this study, we also examined the relationship between sex and dyslipidemia in different classes, which can provide valuable information through precision medicine. Conclusion The comprehensive output of this study, along with gender-specific attributes, provides a better understanding of metabolite dysregulation in dyslipidemia. Machine learning modeling has introduced several highly accurate models for the detection of patients with abnormal TG levels. Alanine, phenylalanine, C16, methionine, C14:2, and C3 were the common diagnostic metabolites in the two most accurate models. The metabolic pathways that have the greatest impact on abnormal TG development are valine, leucine, and isoleucine biosynthesis; phenylalanine, tyrosine, and tryptophan biosynthesis; aminoacyl-tRNA biosynthesis; D-Glutamine and D-glutamate metabolism; and arginine biosynthesis. 
Acylcarnitine names
Free carnitine (C0), acetylcarnitine (C2), propionylcarnitine (C3), malonylcarnitine (C3-DC), butyrylcarnitine (C4), methylmalonyl-/succinylcarnitine (C4-DC), 3-OH-iso-/butyrylcarnitine (C4-OH), isovalerylcarnitine (C5), tiglylcarnitine (C5:1), 3-OH-isovalerylcarnitine (C5-OH), glutarylcarnitine (C5DC), hexanoylcarnitine (C6), octanoylcarnitine (C8), octenoylcarnitine (C8:1), decanoylcarnitine (C10), decenoylcarnitine (C10:1), dodecanoylcarnitine (C12), tetradecanoylcarnitine (C14), tetradecenoylcarnitine (C14:1), tetradecadienoylcarnitine (C14:2), 3-OH-tetradecanoylcarnitine (C14-OH), hexadecanoylcarnitine (C16), 3-OH-hexadecanoylcarnitine (C16-OH), 3-OH-hexadecenoylcarnitine (C16:1-OH), hexadecenoylcarnitine (C16:1), octadecanoylcarnitine (C18), octadecenoylcarnitine (C18:1), 3-OH-octadecanoylcarnitine (C18-OH), 3-OH-octadecenoylcarnitine (C18:1-OH), octadecadienoylcarnitine (C18:2).

Abbreviations
body mass index (BMI), false discovery rate (FDR), branched-chain amino acids (BCAA), aromatic amino acids (AAA), triglyceride (TG), total cholesterol (TC), low plasma high-density lipoprotein cholesterol (HDL-C), machine learning (ML), liquid chromatography-tandem mass spectrometry (LC-MS/MS), Mann–Whitney U-test (MWU), alanine aminotransferase (ALT), receiver operator characteristic curves (ROC)

Declarations
Ethics approval and consent to participate
The ethics committee of the Endocrinology and Metabolism Research Institute, Tehran University of Medical Sciences (IR.TUMS.EMRI.REC. 1395.00141) approved the study protocol, and the study was performed in accordance with the Declaration of Helsinki.

Consent for publication
The purpose of the study was explained to the patients and written informed consent was obtained from all participants.

Availability of data and materials
The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Competing interests

The authors declare that they have no competing interests.

Funding

N/A

Authors' contributions

F.R., A.E., H.D., S.H., and H.M. contributed to the study conception and design. B.A., S.MF., and S.AM. provided study patients and monitored data and specimen collection. N.N., A.DM., and Sh.H. performed the experiments. A.E., F.R., M.A., H.M., M.M., and M.KJ. analyzed the data. A.E., H.M., M.A., Sh.H., and H.DB. wrote the manuscript. All authors read and approved the final manuscript.

References

1. Aryan Z, Mahmoudi N, Sheidaei A, Rezaei S, Mahmoudi Z, Gohari K, et al. The prevalence, awareness, and treatment of lipid abnormalities in Iranian adults: Surveillance of risk factors of noncommunicable diseases in Iran 2016. J Clin Lipidol. 2018 Dec;12(6):1471-1481.e4.
2. National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III). Third Report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III) final report. Circulation. 2002 Dec 17;106(25):3143–421.
3. Blüher M. Adipose tissue dysfunction contributes to obesity related metabolic diseases. Best Pract Res Clin Endocrinol Metab. 2013 Apr 1;27(2):163–77.
4. Lin CF, Chang YH, Chien SC, Lin YH, Yeh HY. Epidemiology of Dyslipidemia in the Asia Pacific Region. Int J Gerontol. 2018 Mar 1;12(1):2–6.
5. Vekic J, Zeljkovic A, Stefanovic A, Jelic-Ivanovic Z, Spasojevic-Kalimanovska V. Obesity and dyslipidemia. Metabolism. 2019 Mar 1;92:71–81.
6. Johnson CB, Davis MK, Law A, Sulpher J. Shared Risk Factors for Cardiovascular Disease and Cancer: Implications for Preventive Health and Clinical Care in Oncology Patients. Can J Cardiol. 2016 Jul;32(7):900–7.
7. Esmati P, Najjar N, Emamgholipour S, Hosseinkhani S, Arjmand B, Soleimani A, et al. Mass spectrometry with derivatization method for concurrent measurement of amino acids and acylcarnitines in plasma of diabetic type 2 patients with diabetic nephropathy. J Diabetes Metab Disord. 2021 Jun;20(1):591–9.
8. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12:2825–30.
9. Shapiro SS, Wilk MB. An analysis of variance test for normality (complete samples). Biometrika. 1965 Dec 1;52(3–4):591–611.
10. Freedman D, Pisani R, Purves R. Statistics: Fourth International Student Edition. W. W. Norton & Company. [accessed 2020].
11. Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016 Sep 15;32(18):2847–9.
12. Evaluation of Feature Selections on Movie Reviews Sentiment. IEEE Conference Publication [Internet]. [cited 2022 Sep 3]. Available from: https://ieeexplore.ieee.org/document/9234287
13. FELLA: an R package to enrich metabolomics data. BMC Bioinformatics [Internet]. [cited 2022 Nov 15]. Available from: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2487-5
14. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003 Nov;13(11):2498–504.
15. Korobov M, Lopuhin K. ELI5 Documentation.
16. Costanzo M, Caterino M, Sotgiu G, Ruoppolo M, Franconi F, Campesi I. Sex differences in the human metabolome. Biol Sex Differ. 2022 Jun 15;13(1):30.
17. Beger RD, Schmidt MA, Kaddurah-Daouk R. Current Concepts in Pharmacometabolomics, Biomarker Discovery, and Precision Medicine. Metabolites. 2020 Mar 27;10(4):E129.
18. Weintraub MS, Grosskopf I, Charach G, Eckstein N, Ringel Y, Maharshak N, et al. Fluctuations of Lipid and Lipoprotein Levels in Hyperlipidemic Postmenopausal Women Receiving Hormone Replacement Therapy. Arch Intern Med. 1998 Sep 14;158(16):1803–6.
19. Yousri NA, Suhre K, Yassin E, Al-Shakaki A, Robay A, Elshafei M, et al. Metabolic and Metabo-Clinical Signatures of Type 2 Diabetes, Obesity, Retinopathy, and Dyslipidemia. Diabetes. 2022 Feb 1;71(2):184–205.
20. Fukagawa NK, Martin JM, Wurthmann A, Prue AH, Ebenstein D, O’Rourke B. Sex-related differences in methionine metabolism and plasma homocysteine concentrations. Am J Clin Nutr. 2000 Jul;72(1):22–9.
21. Rose DP, Leklem JE, Fardal L, Baron RB, Shrago E. Effect of oral alanine loads on the serum triglycerides of oral contraceptive users and normal subjects. Am J Clin Nutr. 1977 May;30(5):691–4.
22. Wiklund P, Zhang X, Tan X, Keinänen-Kiukaanniemi S, Alen M, Cheng S. Serum Amino Acid Profiles in Childhood Predict Triglyceride Level in Adulthood: A 7-Year Longitudinal Study in Girls. J Clin Endocrinol Metab. 2016 May;101(5):2047–55.
23. Newgard CB. Interplay between lipids and branched-chain amino acids in development of insulin resistance. Cell Metab. 2012 May 2;15(5):606–14.
24. Newgard CB, An J, Bain JR, Muehlbauer MJ, Stevens RD, Lien LF, et al. A Branched-Chain Amino Acid-Related Metabolic Signature that Differentiates Obese and Lean Humans and Contributes to Insulin Resistance. Cell Metab. 2009 Apr;9(4):311–26.
25. Siddiqui MS, Sterling RK, Luketic VA, Puri P, Stravitz RT, Bouneva I, et al. Association between high-normal levels of alanine aminotransferase and risk factors for atherogenesis. Gastroenterology. 2013 Dec;145(6):1271-1279.e1-3.
26. Chen Z, Han CK, Pan LL, Zhang HJ, Ma ZM, Huang ZF, et al. Serum alanine aminotransferase independently correlates with intrahepatic triglyceride contents in obese subjects. Dig Dis Sci. 2014 Oct;59(10):2470–6.
27. Adeva-Andany MM, Calvo-Castro I, Fernández-Fernández C, Donapetry-García C, Pedre-Piñeiro AM. Significance of l-carnitine for human health. IUBMB Life. 2017;69(8):578–94.
28. Mauvais-Jarvis F, Berthold HK, Campesi I, Carrero JJ, Dhakal S, Franconi F, et al. Sex- and Gender-Based Pharmacological Response to Drugs. Pharmacol Rev [Internet]. 2021 Apr [cited 2022 Sep 13];73(2). Available from: https://pubmed.ncbi.nlm.nih.gov/33653873/?dopt=Abstract

Tables

Table 1. Baseline characteristics of the study participants, classified into different classes of dyslipidemia.
| Variables | TG N* (n=235) | TG A (n=205) | p-value | TC N (n=186) | TC A (n=161) | p-value | HDL N (n=271) | HDL A (n=245) | p-value | non-HDL cholesterol (n=302) |
|---|---|---|---|---|---|---|---|---|---|---|
| Age (year) | 53.76±10.54 | 53.93±10.60 | 0.8 | 55.94±10.82 | 56.22±10.75 | 0.7 | 57.15±12.14 | 56.81±11.95 | 0.9 | 55.77±11.80 |
| Gender (n) | | | 0.8 | | | 0.5 | | | 0.8 | |
| Female | 119 | 101 | | 117 | 95 | | 136 | 124 | | 152 |
| Male | 116 | 104 | | 69 | 66 | | 135 | 121 | | 150 |
| Area (n) | | | 0.7 | | | 0.3 | | | 0.13 | |
| Rural | 81 | 64 | | 61 | 67 | | 96 | 86 | | 102 |
| Urban | 154 | 141 | | 125 | 94 | | 175 | 159 | | 200 |
| Years of education (n) | | | 0.3 | | | 0.5 | | | 0.001 | |
| 0 | 58 | 35 | | 50 | 49 | | 69 | 63 | | 75 |
| 1-6 | 79 | 77 | | 67 | 44 | | 91 | 86 | | 116 |
| 7-12 | 69 | 68 | | 49 | 40 | | 71 | 63 | | 82 |
| >12 | 29 | 25 | | 20 | 28 | | 40 | 33 | | 29 |
| HTN treatment (n) | | | 0.2 | | | 0.9 | | | 0.5 | |
| No | 194 | 178 | | 156 | 134 | | 233 | 210 | | 251 |
| Yes | 41 | 27 | | 30 | 27 | | 38 | 35 | | 51 |
| BMI (kg/m²) | 28.60±4.67 | 28.48±4.74 | 0.7 | 28.17±4.95 | 28.04±5.21 | 0.8 | 26.88±4.87 | 27.08±4.87 | 0.6 | 27.78±5.00 |
| HbA1c (%) | 5.83±1.29 | 5.95±1.13 | 0.002 | 5.76±1.02 | 5.88±1.05 | 0.04 | 5.68±1.06 | 5.83±1.00 | 0.005 | 5.78±1.14 |
| GLU (mg/dL) | 100.18±37.40 | 106.63±40.90 | 0.004 | 101.74±33.23 | 101.19±36.75 | 0.62 | 96.74±27.20 | 102.94±36.34 | 0.02 | 100.59±34.0 |
| NHC (mg/dL) | 118.94±29.37 | 149.51±33.65 | 6E-20 | 119.13±25.32 | 177.50±20.33 | 2E-42 | 122.60±34.13 | 130.87±34.11 | 0.009 | 103.93±18.4 |

* Continuous variables are presented as mean ± SD, and categorical variables are presented as the number in each category. N = normal, A = abnormal. HTN: hypertension, BMI: body mass index, WC: waist circumference, HC: hip circumference, BP: blood pressure, TG: triglycerides, NHC: non-HDL cholesterol, GLU: glucose.

Figures

Figure 1. Mann–Whitney U-test p-values (A) and Pearson’s correlation coefficient (r) (B) for all studied features in the TG, TC, non-HDL cholesterol, and HDL groups based on normal and abnormal categories.
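The two per-feature statistics summarized in Figure 1 can be illustrated with a minimal Python sketch that uses only the standard library. This is not the authors' code: the function names, the normal approximation for the p-value (with tie correction and no continuity correction), and the small example values are assumptions made purely for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j get one averaged rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:  # all pooled values identical: no evidence of a difference
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p_value


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical concentrations of one metabolite (illustrative values only).
normal_tg = [310.0, 295.5, 330.2, 301.1, 288.4]
high_tg = [352.7, 340.9, 365.3, 333.8, 371.0]
u_stat, p = mann_whitney_u(normal_tg, high_tg)
print(f"U = {u_stat}, two-sided p = {p:.4f}")
```

In practice the same test would be repeated for every feature and each lipid class, and the resulting p-values would typically be adjusted for multiple testing before interpretation.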
Figure 2. (A) Receiver operating characteristic (ROC) curves for TG classification. (B) Feature importance and confusion matrix for the SVM model in TG classification. (C) Feature importance and confusion matrix for the LR model in TG classification. (D) Alanine, phenylalanine, C16, methionine, C14:2, and C3 are the common features extracted using both the SVM and LR models.
Figure 3. (A) P-values stratified by sex comparing outcomes (normal/abnormal) for the TG, TC, non-HDL, and HDL groups. Box plots show the most significant features in the TG (B), TC (C), non-HDL (D), and HDL (E) groups.
Figure 4. KEGG pathway analysis based on p-values and enrichment ratios.
Figure 5. The most significant nodes in terms of betweenness, identified with CytoHubba, in the network obtained using FELLA.

Supplementary Files

SupplementaryDyslipidemiaetemadi.docx