Conference Paper, 2025

Cross-Dataset Framework for Diabetes Prediction with KAN and MLP

Armun Alam, Jahanggir Hossain Setu, Syed Tangim Pasha, Nabarun Halder, Ashraful Islam, M. Ashraful Amin

Innovative Computing 2025, Volume 4

Springer Nature Singapore, pp. 66–72, ISBN: 978-981-96-8011-5

Abstract

Diabetes mellitus, a chronic condition caused by insufficient insulin production or poor cellular response to insulin, leads to elevated blood glucose levels and, if untreated, can severely damage organs such as the heart, kidneys, and eyes. Diabetes is preventable and associated with lifestyle factors. Given the global rise in diabetes cases, effective early prediction is necessary. This study evaluates the effectiveness of two advanced Machine Learning (ML) algorithms, the Kolmogorov-Arnold Network (KAN) and the Multi-Layer Perceptron (MLP), for diabetes classification across three datasets: Indian PIMA, DiaHealth Bangladesh, and Taiwan. An eXtreme Gradient Boosting (XGBoost)-based feature selection technique identified core attributes for model consistency, while KAN and MLP were trained on different dataset combinations and tested on a third, covering three configurations. Results indicate that KAN achieved higher accuracy (up to 74%) and F1-Score (up to 65%) in certain configurations, surpassing MLP in most cases. Although moderate precision and recall highlight potential data imbalance, KAN demonstrates promising results for diabetes prediction.
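The cross-dataset protocol mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each configuration trains on two of the three datasets and tests on the remaining one; the short dataset labels and the helper name `cross_dataset_configs` are hypothetical and not taken from the paper, and no model training or feature selection is shown.

```python
# Hypothetical short labels for the three datasets named in the abstract.
DATASETS = ("PIMA", "DiaHealth", "Taiwan")


def cross_dataset_configs(names):
    """Yield (train_names, test_name) pairs.

    Each dataset is held out once as the test set, and the
    remaining datasets form the training combination.
    """
    names = tuple(names)
    for test_name in names:
        train_names = tuple(n for n in names if n != test_name)
        yield train_names, test_name


configs = list(cross_dataset_configs(DATASETS))

for train_names, test_name in configs:
    print(f"train on {' + '.join(train_names)} -> test on {test_name}")
```

With three datasets this yields exactly three configurations, which matches the count stated in the abstract; whether the authors pooled the two training datasets in this way is an assumption of the sketch.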