Automatic Question Answering Model Based on Sentence Relevance Ranking Using the BERT Model

Jati Sasongko Wibowo(1*), Herny Februariyanti(2), Hersatoto Listiyono(3)

(1) Universitas Stikubank, Indonesia
(2) Universitas Stikubank, Indonesia
(3) Universitas Stikubank, Indonesia
(*) Corresponding Author

Abstract


This research develops an automatic question answering system based on sentence relevance ranking, using the BERT model and the SQuAD dataset. The approach has four main components: question understanding and keyword identification; sentence relevance ranking with techniques such as cosine similarity over TF-IDF scores; use of the BERT model to enrich text representations and capture context in depth; and performance evaluation with the F1 Score and Exact Match metrics to assess the accuracy and precision of the answers. The system achieves an F1 Score of 0.6 and an Exact Match of 0.5. The objective is a system that answers questions with more accurate and contextualised sentence relevance. The main contribution is an advance in natural language processing (NLP) that integrates the BERT model, the SQuAD dataset, and evaluation with rigorous metrics; the system is expected to improve users' access to information through more precise and contextualised answers.
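The sentence relevance ranking step described in the abstract can be illustrated with a small sketch: candidate sentences are ranked against the question by cosine similarity over TF-IDF vectors. Everything below (function names, tokenisation, the toy sentences) is an illustrative assumption, not the paper's actual implementation; a minimal pure-Python sketch:

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and strip simple trailing punctuation (illustrative tokeniser)."""
    return [w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")]

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # smoothed IDF
    return [{t: (tf / len(doc)) * idf[t] for t, tf in Counter(doc).items()}
            for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_sentences(question, sentences):
    """Rank candidate sentences by TF-IDF cosine similarity to the question."""
    docs = [tokenize(question)] + [tokenize(s) for s in sentences]
    vecs = tfidf_vectors(docs)
    q_vec, s_vecs = vecs[0], vecs[1:]
    scored = [(cosine(q_vec, v), s) for v, s in zip(s_vecs, sentences)]
    return sorted(scored, key=lambda x: x[0], reverse=True)

sentences = [
    "BERT was introduced by Google in 2018.",
    "The weather today is sunny.",
    "BERT is pre-trained on large text corpora.",
]
ranking = rank_sentences("When was BERT introduced?", sentences)
```

In a BERT-based pipeline, the top-ranked sentences would then be passed to the model as reading-comprehension context rather than the full document.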
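The reported F1 Score and Exact Match figures follow the standard SQuAD-style token-overlap definitions. A minimal sketch of those two metrics, assuming a simplified `normalize` (the official SQuAD evaluation script additionally removes articles and handles punctuation more thoroughly):

```python
from collections import Counter

def normalize(text):
    """Lowercase and strip simple punctuation (simplified SQuAD normalisation)."""
    return [w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")]

def exact_match(prediction, truth):
    """1.0 if the normalised prediction equals the normalised gold answer, else 0.0."""
    return float(normalize(prediction) == normalize(truth))

def f1_score(prediction, truth):
    """Token-overlap F1 between a predicted and a gold answer span."""
    pred, gold = normalize(prediction), normalize(truth)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "in 2018" against the gold answer "October 2018" shares one of two tokens on each side (precision = recall = 0.5), giving F1 = 0.5 but Exact Match = 0; over a dataset, both metrics are averaged across questions.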






DOI: https://doi.org/10.30645/kesatria.v5i3.427

DOI (PDF): https://doi.org/10.30645/kesatria.v5i3.427.g423


