Improving Large Language Model’s Ability to Find the Words Relationship

  • Sirojul Alam Universitas Pertahanan Indonesia, Bogor, Indonesia
  • Jaka Abdul Jabar Universitas Pertahanan Indonesia, Bogor, Indonesia
  • Fauzi Abdurrachman Universitas Pertahanan Indonesia, Bogor, Indonesia
  • Bambang Suharjo Universitas Pertahanan Indonesia, Bogor, Indonesia
  • H.A Danang Rimbawa Universitas Pertahanan Indonesia, Bogor, Indonesia
Keywords: Ability to Find, Large Language Model’s, Words Relationship

Abstract

Background: The capabilities of popular and widely used large language models (LLMs), such as the Generative Pre-trained Transformer (GPT), can still be enhanced. One method of achieving this is the Retrieval-Augmented Generation (RAG) architecture, which incorporates external data into the model to improve LLM capabilities.
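
To make the RAG idea concrete, the sketch below retrieves the stored documents most similar to a query and prepends them to the prompt before generation. This is a minimal sketch, not the authors' pipeline: the embedding model, the toy document store, and the `retrieve`/`build_prompt` helpers are assumptions made for illustration.

```python
# Minimal RAG sketch: retrieve the documents most similar to a query,
# then prepend them to the prompt so the generator can ground its
# answer in external data. Model and corpus are illustrative only.
from sentence_transformers import SentenceTransformer, util

# Hypothetical external document store; any corpus would do.
documents = [
    "Logistic Regression achieved the best accuracy and F1 score.",
    "Naive Bayes is a simple probabilistic classifier.",
    "Support Vector Machine (SVM) finds a maximum-margin hyperplane.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model
doc_embeddings = embedder.encode(documents, convert_to_tensor=True)

def retrieve(query: str, k: int = 2) -> list:
    """Return the k stored documents most similar to the query."""
    query_embedding = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, doc_embeddings, top_k=k)[0]
    return [documents[hit["corpus_id"]] for hit in hits]

def build_prompt(query: str) -> str:
    """Augment the query with retrieved context before generation."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("Which classifier performed best?"))
```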

Objective: The aim of this research is to demonstrate that RAG can help LLMs respond with greater precision and sounder reasoning.

Method: This work uses the Hugging Face Application Programming Interface (API) to embed words, store the embeddings, and find relationships between the words.
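
Since the abstract does not include implementation details, the following is a minimal sketch of the embedding-and-similarity step via the hosted Hugging Face Inference API; the specific model (`sentence-transformers/all-MiniLM-L6-v2`), the word list, and the `embed`/`cosine_similarity` helpers are assumptions for illustration.

```python
# Sketch of embedding words via the hosted Hugging Face Inference API
# and comparing them with cosine similarity. The model name and word
# list are assumptions; the paper does not specify them.
import numpy as np
from huggingface_hub import InferenceClient

client = InferenceClient(model="sentence-transformers/all-MiniLM-L6-v2")

def embed(text: str) -> np.ndarray:
    """Call the Hugging Face API and return one vector for the input."""
    vec = np.asarray(client.feature_extraction(text))
    # Some models return token-level vectors; mean-pool them if so.
    return vec.mean(axis=0) if vec.ndim > 1 else vec

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Store embeddings, then rank candidate terms against a query term.
terms = ["Logistic Regression", "accuracy", "F1 score", "banana"]
store = {t: embed(t) for t in terms}
query = store["Logistic Regression"]
for t in terms[1:]:
    print(t, round(cosine_similarity(query, store[t]), 3))
```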

Result: The results show that RAG performs well, as the clearly rendered relationship graph demonstrates. The knowledge obtained is logical and understandable; for example, the term Logistic Regression is linked to accuracy and F1 score and is identified as a simple model that outperforms the Naïve Bayes and Support Vector Machine (SVM) models.
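
To illustrate the kind of graph output described, the sketch below rebuilds a tiny version of the relationship graph using `networkx`; the nodes and edges encode only the relations stated in this abstract, not the authors' actual output.

```python
# Illustrative reconstruction of the relationship graph described in
# the result; nodes and edges come only from the abstract's summary,
# not from the authors' actual output.
import matplotlib.pyplot as plt
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Logistic Regression", "accuracy"),
    ("Logistic Regression", "F1 score"),
    ("Logistic Regression", "Naive Bayes"),             # compared with
    ("Logistic Regression", "Support Vector Machine"),  # compared with
])

pos = nx.spring_layout(G, seed=42)  # deterministic layout for repeatability
nx.draw(G, pos, with_labels=True, node_color="lightblue",
        node_size=2200, font_size=8)
plt.savefig("word_relationships.png", dpi=150)
```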

Conclusion: RAG effectively improves the capabilities of LLMs.


Published
2024-11-09
How to Cite
Alam, S., Abdul Jabar, J., Abdurrachman, F., Suharjo, B., & Rimbawa, H. D. (2024). Improving Large Language Model’s Ability to Find the Words Relationship. Jurnal Bumigora Information Technology (BITe), 6(2), 141-148. https://doi.org/10.30812/bite.v6i2.4127