Implementation of the Improved K-Means Method with the DBSCAN Algorithm for Film Clustering

  • Muhammad Muhajir Universitas Islam Indonesia
  • Annisa Ayunda Permata Sari Universitas Islam Indonesia
Keywords: Movie, IMDb, K-means, DBSCAN

Abstract

The Indonesian film industry continues to grow, as shown by the number of films now appearing in theaters and a box-office increase of 28 percent per year over the past four years. The Internet Movie Database (IMDb) is a website that provides information about films from around the world, including the people involved in them, from actors, directors, and writers to makeup artists and soundtracks. This study examines the characteristics of films and the factors that lead a film to be included in the IMDb Top 250. The data were scraped from the IMDb website. The analysis uses two non-hierarchical clustering methods, k-means and DBSCAN: the DBSCAN algorithm first determines the optimal number of clusters, and the k-means algorithm then groups the data around centroids. The analysis found that the factors that could influence a film's inclusion in the IMDb Top 250 were duration, number of votes, and direction by Rajkumar Hirani, and that the DBSCAN algorithm yielded an optimal number of six clusters. With the improved k-means algorithm, the accuracy of the clustering results is 87.2%.
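
The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy points, the `eps` and `min_pts` values, the function names, and the choice to seed k-means with the means of the DBSCAN clusters are all assumptions made for illustration. The abstract only states that DBSCAN supplies the number of clusters and that k-means then groups the data around centroids.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nb)
    return labels

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm started from the given initial centers."""
    centers = list(centers)
    assign = []
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance.
        assign = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == k]
            new_centers.append(mean(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return assign, centers

def improved_kmeans(points, eps, min_pts):
    """Assumed pipeline: DBSCAN picks k, its cluster means seed k-means."""
    labels = dbscan(points, eps, min_pts)
    k = max(labels) + 1
    if k == 0:
        raise ValueError("DBSCAN found no clusters; adjust eps or min_pts")
    seeds = [mean([p for p, l in zip(points, labels) if l == c]) for c in range(k)]
    assign, centers = kmeans(points, seeds)
    return k, assign, centers

# Toy data (hypothetical): three tight groups plus one isolated point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9), (9.2, 1.1),
    (5.0, 9.5),
]
k, assign, centers = improved_kmeans(points, eps=0.5, min_pts=3)
print("clusters:", k)
print("assignments:", assign)
```

In this sketch DBSCAN marks the isolated point as noise, so it does not affect the cluster count, while the subsequent k-means pass still assigns it to its nearest centroid. Real film data with mixed features such as duration and vote counts would need scaling before distances are meaningful.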

Published
2020-06-30
How to Cite
Muhajir, M., & Sari, A. (2020). Implementasi Metode Improved K-Means dengan Algoritma Dbscan untuk Pengelompokan Film. UJMC (Unisda Journal of Mathematics and Computer Science), 6(01), 1-8. https://doi.org/10.52166/ujmc.v6i01.1923