• Irfan Soliani, Sistem Informasi, Fakultas Teknologi Informasi, Universitas Budi Luhur, DKI Jakarta, Indonesia
• Safitri Juanita, Sistem Informasi, Fakultas Teknologi Informasi, Universitas Budi Luhur, DKI Jakarta, Indonesia
Keywords: CRISP-DM, Davies-Bouldin Index, K-Means Algorithm, Prevalence of Disease


In 2019, the World Health Organization (WHO) reported that the top 10 diseases accounted for 55% of the 55.4 million deaths worldwide. In Indonesia, West Java, whose capital is Bandung, is the most populous province. According to the health profile of the Bandung City Hospital, the ten most common diseases were identified from 18,147 cases. However, these data have not yet been processed into information that the health department, particularly in the city of Bandung, can use to determine disease cases by age group. The contribution of this study is therefore to group the prevalence of disease cases by age at Bandung City Hospital, with the aim of helping the Bandung City Health Office take preventive, treatment, and counselling actions against diseases whose prevalence depends on age. The study follows the CRISP-DM methodology, uses K-Means clustering, and evaluates the clusters with the elbow method and the Davies-Bouldin Index (DBI). Data were processed with RapidMiner and Python. The study concludes that the optimal number of clusters is K=2. Cluster 0 contains the diseases with the lowest case counts, and cluster 1 contains the diseases with the highest case counts. Cluster 1 corresponds to the elderly and adult age groups, while cluster 0 corresponds to the infant, toddler, and child age groups.
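
The evaluation described above (K-Means, with the elbow method and the Davies-Bouldin Index used to choose K) can be illustrated with a minimal sketch. This is not the study's code or data: the rows in `cases` are synthetic, made-up case counts for five hypothetical age groups, and the helper names `kmeans` and `davies_bouldin` are assumptions introduced here. The sketch uses only the Python standard library and no feature scaling.

```python
import math
import random

# SYNTHETIC data (not from the study): each row is a hypothetical record with
# case counts for five age groups (infant, toddler, child, adult, elderly).
cases = [
    (12, 20, 15, 30, 25), (10, 18, 22, 28, 20),
    (15, 25, 18, 35, 30), (8, 15, 12, 25, 22),
    (20, 30, 40, 400, 350), (25, 28, 35, 420, 380),
    (18, 35, 45, 390, 360), (22, 32, 38, 410, 340),
]

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-Means. Returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Inertia: within-cluster sum of squared distances (used by the elbow method).
    inertia = sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return labels, centroids, inertia

def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin Index for k >= 2 clusters; lower means better separation."""
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        # Mean distance of a cluster's members to its centroid.
        scatter.append(sum(math.dist(p, centroids[j]) for p in members) / len(members)
                       if members else 0.0)
    # For each cluster, take its worst (largest) similarity ratio, then average.
    worst = [max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

# Elbow method: inertia for K = 1..4. DBI is only defined for K >= 2.
results = {}
for k in range(1, 5):
    labels, centroids, inertia = kmeans(cases, k)
    dbi = davies_bouldin(cases, labels, centroids) if k >= 2 else None
    results[k] = (inertia, dbi)
    print(f"K={k}  inertia={inertia:.1f}  DBI={dbi}")
```

In such a sketch, the elbow is the K after which inertia stops dropping sharply, and the K with the lowest DBI is taken as the best-separated clustering. On real case counts with very different magnitudes per age group, standardizing the columns before clustering would likely be worth considering.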


