Implementation of the Improved K-Means Method with the DBSCAN Algorithm for Film Clustering
Abstract
The Indonesian film industry continues to grow, as seen in the number of films now showing in theaters and in a box office increase of 28 percent per year over the past four years. The Internet Movie Database (IMDb) is a website that provides information about films from around the world, including the people involved in them, from actors, directors, and writers to makeup artists and soundtrack contributors. This study examines the characteristics of films and the factors that lead a film to be included in the IMDb Top 250. The data were scraped from the IMDb website. The method used is non-hierarchical clustering, namely k-means and DBSCAN: the DBSCAN algorithm first determines the optimum number of clusters, and the data are then grouped around centroids with the k-means algorithm. The analysis found that the factors that could influence a film's inclusion in the IMDb Top 250 were duration, number of votes, and being directed by Rajkumar Hirani. The optimal number of clusters obtained with the DBSCAN algorithm was six. With the improved k-means algorithm, the accuracy of the cluster results is 87.2%.
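The two-stage procedure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the toy 2-D points, the `eps` and `min_pts` values, the helper names, and in particular the choice to seed k-means with the means of the DBSCAN clusters are all assumptions made for illustration. The abstract only states that DBSCAN supplies the number of clusters and that k-means then groups the data around centroids.

```python
from math import dist


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 marking noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


def mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def kmeans(points, centroids, max_iter=100):
    """Lloyd's k-means starting from the given centroids."""
    for _ in range(max_iter):
        assign = [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        new = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == c]
            new.append(mean(members) if members else centroids[c])
        if new == centroids:  # centroids stopped moving
            break
        centroids = new
    return assign, centroids


def dbscan_then_kmeans(points, eps, min_pts):
    """Stage 1: DBSCAN picks k. Stage 2: k-means groups every point."""
    labels = dbscan(points, eps, min_pts)
    k = len({l for l in labels if l != -1})
    # Assumption: seed k-means with the mean of each DBSCAN cluster.
    seeds = [mean([p for p, l in zip(points, labels) if l == c]) for c in range(k)]
    assign, centroids = kmeans(points, seeds)
    return k, assign, centroids


# Hypothetical toy data: three tight groups plus one stray point.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.8),
          (6.5, 6.5)]

k, assign, centroids = dbscan_then_kmeans(points, eps=0.5, min_pts=3)
print("clusters found by DBSCAN:", k)
print("k-means assignment:", assign)
```

In this sketch DBSCAN treats the stray point as noise when counting clusters, while k-means still assigns it to its nearest centroid, so every point ends up in a group.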
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish in UJMC (Unisda Journal of Mathematics and Computer Science) agree to the following terms:
1. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.