Sándor Zsombor: Parameter-Efficient Fine-Tuning of Large Language Models

Independent project, professional internship II

2024/25, first semester

Supervisor:
Csanády Bálint Zsombor (ELTE Faculty of Science, Department of Computer Science, AI Research Group)

LLMs such as the proprietary GPT-4 and the open-source Llama 2 are compelling solutions for large-scale data annotation in NLP. Indeed, minimal prompt-tuning enables them to handle a wide variety of NLP tasks proficiently. However, running such LLMs on millions of prompts demands large and expensive computational resources. Previously, we focused on leveraging LLMs' language modeling capabilities for classification tasks involving millions of items while using relatively modest resources. This method, which we called LlamBERT, involved annotating a small subset of the corpus using Llama 2 and fine-tuning a BERT model on this annotation. The aim of the project is to assess parameter-efficient fine-tuning (PEFT) techniques such as LoRA, prefix tuning, and P-tuning as a way to potentially further increase the quality of the data initially provided by the Llama 2 annotation.
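
To make the idea behind LoRA concrete, the following is a minimal sketch in plain Python, not the project's implementation. It assumes a single linear layer with a frozen weight matrix W and a trainable low-rank update scaled by alpha / r. The names matmul, lora_forward, and trainable_params, as well as the layer sizes, are hypothetical and chosen only for illustration. The sketch shows two things: with the second adapter matrix initialized to zero, the adapted layer reproduces the frozen layer, and the adapter has far fewer trainable weights than the full matrix.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_forward(x, w, a, b, alpha):
    """Return x @ W + (alpha / r) * (x @ A) @ B, where W stays frozen."""
    r = len(b)  # the rank is the number of rows of B
    base = matmul(x, w)
    update = matmul(matmul(x, a), b)
    scale = alpha / r
    return [[p + scale * q for p, q in zip(base_row, upd_row)]
            for base_row, upd_row in zip(base, update)]

def trainable_params(d_in, d_out, r):
    """Count trainable weights: (full fine-tuning, rank-r LoRA adapter)."""
    return d_in * d_out, r * (d_in + d_out)

random.seed(0)
d_in, d_out, r = 8, 6, 2  # toy sizes, assumed for illustration only
w = [[random.gauss(0, 1) for _ in range(d_out)] for _ in range(d_in)]  # frozen
a = [[random.gauss(0, 0.01) for _ in range(r)] for _ in range(d_in)]   # trainable
b = [[0.0] * d_out for _ in range(r)]                                   # trainable, zero init
x = [[random.gauss(0, 1) for _ in range(d_in)]]

# With B = 0 the adapted layer gives exactly the frozen layer's output.
assert lora_forward(x, w, a, b, alpha=4) == matmul(x, w)

# A 768 x 768 projection with a rank-8 adapter.
print(trainable_params(768, 768, 8))  # (589824, 12288)
```

In this toy setting the rank-8 adapter trains 12288 weights instead of 589824, roughly 2% of the full matrix, which is the kind of saving that makes PEFT attractive under modest resources.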

e-mail (the above e-mail is incorrect): csbalint@cs.elte.hu