Evaluating Responsiveness and Accuracy: A Performance Comparison of ChatGPT and Google BARD in Answering Python-Related Questions

Yayan Heryanto(1*), F Fauziah(2), Frenda Farahdinna(3), Sigit Wijanarko(4)

(1) Universitas Nasional, Indonesia
(2) Universitas Nasional, Indonesia
(3) Universitas Nasional, Indonesia
(4) Universitas Nasional, Indonesia
(*) Corresponding Author

Abstract


This research evaluates the responsiveness and accuracy of two natural language processing systems, ChatGPT and Google BARD, in answering questions about the Python programming language. Accuracy is measured with the BLEU score, which quantifies how closely each generated answer matches an expected reference answer. The experiments pose a range of Python-related questions to both systems. The average BLEU score is 0.0088 for ChatGPT and 0.0073 for Google BARD. The recorded response time is 12.05 seconds for ChatGPT and 18.38 seconds for Google BARD. The difference in accuracy is small, but ChatGPT achieves both a slightly higher BLEU score and a faster response time. The research concludes that, for questions about the Python programming language, ChatGPT performs slightly better than Google BARD in terms of answer accuracy and response time.
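To illustrate the two measurements named in the abstract, the following is a minimal sketch of a sentence-level BLEU score and a response-time measurement. It is not the authors' implementation: the abstract does not state the tokenization, n-gram order, or smoothing that was used, so whitespace tokenization, n-grams up to order 4, and no smoothing are assumptions here. The names `bleu`, `timed_answer`, and `ask` are hypothetical, and `ask` stands in for whatever function sends a question to a chatbot.

```python
import math
import time
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(reference, candidate, max_n=4):
    """Simplified single-reference BLEU (assumed settings, no smoothing)."""
    ref = reference.lower().split()   # assumption: whitespace tokenization
    cand = candidate.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions with uniform weights.
    return bp * math.exp(sum(log_precisions) / max_n)


def timed_answer(ask, question):
    """Call ask(question) and return (answer, elapsed seconds)."""
    start = time.perf_counter()
    answer = ask(question)
    return answer, time.perf_counter() - start


if __name__ == "__main__":
    reference = "python is an interpreted high level language"
    answer, seconds = timed_answer(
        lambda q: "python is an interpreted language",  # stand-in chatbot
        "What is Python?",
    )
    print(f"BLEU = {bleu(reference, answer):.4f}, time = {seconds:.4f} s")
```

Averaging `bleu` and the elapsed time over all questions for each system would give per-system figures comparable in kind to those reported above. Because this sketch applies no smoothing, any answer that shares no 4-gram with its reference scores zero, which is one reason average scores for free-text answers can be very low.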



DOI: http://dx.doi.org/10.30645/jurasik.v9i1.731

DOI (PDF): http://dx.doi.org/10.30645/jurasik.v9i1.731.g706




JURASIK (Jurnal Riset Sistem Informasi dan Teknik Informatika)