Evaluating Natural Language Understanding of LLMs with Turkish Idioms and Proverbs

This project assesses the natural language understanding of large language models (LLMs) that support Turkish, covering both multilingual and monolingual models. A key component is a dataset of scenarios in which Turkish idioms and proverbs are contextually appropriate.

We will explore two approaches to building this dataset:

  1. Synthetic Generation with LLMs: Creating contextually appropriate texts using LLMs.
  2. Manipulation of Existing Texts: Adapting relevant content from existing resources, including web corpora, books, and other literature.
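
As a rough illustration of how entries from both approaches could share one format, the sketch below stores each example as a JSON Lines record. The class name, field names, and the "synthetic"/"adapted" labels are assumptions made for this example, not part of the project specification, and the sample sentence is invented for illustration.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical labels for the two dataset construction approaches.
VALID_SOURCES = {"synthetic", "adapted"}
VALID_KINDS = {"idiom", "proverb"}


@dataclass
class IdiomExample:
    """One dataset entry (illustrative schema, not the project's own)."""
    expression: str  # the idiom or proverb itself
    kind: str        # "idiom" or "proverb"
    context: str     # a scenario in which the expression is appropriate
    source: str      # "synthetic" (LLM-generated) or "adapted" (existing text)

    def __post_init__(self):
        if self.kind not in VALID_KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")
        if self.source not in VALID_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")


def to_jsonl(examples):
    """Serialize examples as JSON Lines, keeping Turkish characters readable."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in examples)


example = IdiomExample(
    expression="Damlaya damlaya göl olur",
    kind="proverb",
    context="Ayşe her ay küçük bir miktar para biriktirdi ve sonunda bir ev aldı.",
    source="synthetic",
)
print(to_jsonl([example]))
```

Recording the source of each entry would make it possible to compare later how models perform on synthetic versus adapted contexts.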

The curated dataset will then serve as a foundation for designing tasks that evaluate LLMs' performance in Turkish through various scenarios, such as:

  • Open-ended Text Generation: Evaluating the models' ability to produce coherent and contextually appropriate responses.
  • Multiple Choice Question Answering: Assessing the models' comprehension and application of idioms and proverbs in specific contexts.
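
The multiple-choice task could be scored along the following lines. This is a minimal sketch under stated assumptions: the item structure, the prompt layout, and the `predict` callback (a stand-in for whatever model interface is used) are hypothetical, and the sample item is invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class MultipleChoiceItem:
    """A context plus candidate expressions (illustrative structure)."""
    context: str
    options: Tuple[str, ...]
    answer_index: int  # position of the correct option in `options`


def format_prompt(item: MultipleChoiceItem) -> str:
    """Render the context followed by lettered options, one per line."""
    lines = [item.context, ""]
    for letter, option in zip("ABCDEFGH", item.options):
        lines.append(f"{letter}) {option}")
    return "\n".join(lines)


def accuracy(
    items: Sequence[MultipleChoiceItem],
    predict: Callable[[MultipleChoiceItem], int],
) -> float:
    """Fraction of items for which `predict` returns the correct index."""
    if not items:
        raise ValueError("no items to evaluate")
    correct = sum(1 for item in items if predict(item) == item.answer_index)
    return correct / len(items)


item = MultipleChoiceItem(
    context="Ayşe her ay küçük bir miktar para biriktirdi ve sonunda bir ev aldı.",
    options=(
        "Damlaya damlaya göl olur",
        "Sakla samanı, gelir zamanı",
        "Ak akçe kara gün içindir",
    ),
    answer_index=0,
)
print(format_prompt(item))
```

In practice `predict` would wrap a call to the model under evaluation and map its reply back to an option index. Open-ended generation would need a different scoring method, which this sketch does not cover.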

Relevant resource: https://aclanthology.org/2024.acl-long.279/

Project Advisor: 

Suzan Üsküdarlı

Project Status: 

Project Year: 

2024 (Fall)
