ANALYSIS OF THE MOVIE DATABASE FILM RATING PREDICTION WITH ENSEMBLE LEARNING USING RANDOM FOREST REGRESSION METHOD
Abstract
The film industry has become highly profitable. During the COVID-19 pandemic, however, it was hit hard: screenings of new films were delayed, many cinemas were prohibited from operating and closed completely, and permits for filmmaking were difficult to obtain. To survive the impact of the pandemic, the industry needs to consider factors such as targeted promotion, supported by predictive decisions that reflect the market and its trends. Predicting a film's success helps to estimate the rating and quality of a film before it is released. This study uses the Random Forest Regression method to carry out predictive analysis on films. It uses the M-estimate encoding technique to convert categorical data into numerical data, and the results show that applying M-estimate encoding increases the correlation values between features. With 1000 trees and a split of 80% training data and 20% testing data, the Random Forest Regression method achieved an R² score of 86%, an MSE of 12%, an RMSE of 35%, and an MAE of 22%. The 10-fold cross-validation score was 85%. This shows that the Random Forest Regression method with 80% training data produces the best performance score.
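The two computational steps named in the abstract, M-estimate encoding and the regression metrics, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's code. It assumes the common formulation of M-estimate encoding, in which each category is replaced by (sum of targets in the category + m × global target mean) / (category count + m). The function names, the value m = 1, and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def m_estimate_encode(categories, targets, m=1.0):
    """Map each category to (sum of its targets + m * prior) / (count + m)."""
    prior = sum(targets) / len(targets)  # global mean of the target
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {cat: (sums[cat] + m * prior) / (counts[cat] + m) for cat in sums}


def regression_metrics(y_true, y_pred):
    """Return R^2, MSE, RMSE and MAE for paired true and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": 1 - ss_res / ss_tot, "mse": mse, "rmse": sqrt(mse), "mae": mae}


# Hypothetical toy data: a genre column and film ratings (not from the study).
genres = ["drama", "drama", "action", "comedy"]
ratings = [8.0, 6.0, 5.0, 7.0]
encoding = m_estimate_encode(genres, ratings, m=1.0)
encoded_genres = [encoding[g] for g in genres]

# Hypothetical predictions, used only to exercise the metric formulas.
predicted = [7.5, 6.5, 5.5, 6.5]
scores = regression_metrics(ratings, predicted)
```

Because RMSE is the square root of MSE, the reported values are consistent with each other if read as decimals: the square root of 0.12 is approximately 0.35.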
Copyright (c) 2025 Nuravifah Novembriana Marpid, Yogiek Indra Kurniawan, Swahesti Puspita Rahayu

This work is licensed under a Creative Commons Attribution 4.0 International License.