Deteksi Ujaran Kebencian Menggunakan Metode Support Vector Machine (SVM)

Moh. Attar Jibran(1*), Ade Eviyanti(2), Yulian Findawati(3)

(1) Universitas Muhammadiyah Sidoarjo, Sidoarjo, Indonesia
(2) Universitas Muhammadiyah Sidoarjo, Sidoarjo, Indonesia
(3) Universitas Muhammadiyah Sidoarjo, Sidoarjo, Indonesia
(*) Corresponding Author

Abstract


Hate speech is a linguistic phenomenon that deviates from the norms of polite language and communication ethics. Today hate speech is widespread on the internet, especially among social media users. This research aims to detect whether a word or sentence contains hate speech, using the Support Vector Machine (SVM) method for classification. The data were collected from Twitter tweets using the Tweepy API, giving a total of 1,681 labeled samples: 643 labeled HS (tweets that are hate speech) and 1,038 labeled Non_HS (tweets that are not hate speech). For word weighting, the researchers used Term Frequency-Inverse Document Frequency (TF-IDF), which reflects how often each word occurs in the dataset. For classification, the researchers compared two methods, Support Vector Machine and XGBoost. The best result came from SVM with 90% training data and 10% test data, which obtained a training score of 95.87% and a test score of 87.30%, a gap of 8.57%. The SVM model was then tuned using Randomized Search Cross Validation (RSCV), which raised the training score to 100% and the test score to 93.20%, a gap of 6.80%.
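
As an illustration of the word-weighting step described in the abstract, the following is a minimal sketch of TF-IDF in plain Python. It assumes the textbook form tf × log(N / df) and whitespace tokenization; the abstract does not state which TF-IDF variant or tokenizer the authors used, and common libraries apply smoothed variants that give different numbers. The names `tf_idf`, `score_gap`, and `toy_tweets` are hypothetical, and the three example sentences are invented for illustration, not taken from the paper's dataset.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Assumption: textbook TF-IDF with whitespace tokenization. This is not
    necessarily the variant used in the paper.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Term frequency is normalized by document length.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def score_gap(train_score, test_score):
    """Difference between training and test score, in percentage points."""
    return round(train_score - test_score, 2)


# Hypothetical toy sentences, invented for this sketch.
toy_tweets = [
    "kamu hebat sekali",
    "kamu jahat sekali",
    "cuaca hari ini cerah",
]

if __name__ == "__main__":
    for tweet, weights in zip(toy_tweets, tf_idf(toy_tweets)):
        print(tweet, weights)
    # Gaps between the training and test scores reported in the abstract.
    print(score_gap(95.87, 87.30))  # SVM before tuning
    print(score_gap(100.0, 93.20))  # SVM after RSCV tuning
```

In this sketch, a word that occurs in only one sentence receives a higher weight than a word shared by several sentences, which is the property that makes TF-IDF useful as input to a classifier such as SVM. The `score_gap` helper only reproduces the arithmetic behind the gaps quoted in the abstract.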






DOI: https://doi.org/10.30645/kesatria.v4i4.239

DOI (PDF): https://doi.org/10.30645/kesatria.v4i4.239.g237
