| Author name | Panagiotis Tamvakidis |
|---|---|
| Title | Argumentative sentence classification using transfer learning across languages |
| Year | 2020-2021 |
| Supervisor | George Petasis |
Transfer learning is commonly used to make machine learning tasks faster and more successful, and it is also useful for text analysis. Argument mining is one natural language processing task to which transfer learning can be applied. Most research and development in machine learning is carried out on English, so transfer learning offers a way to carry that knowledge over to machine learning and deep learning tasks in other languages. This work applies transfer learning in that way.
This thesis identifies arguments in sentences by applying transfer learning techniques. A sentence is considered argumentative when it contains a claim or a premise. The main idea is that contextual embeddings trained on English are aligned with the embeddings of the Greek model, so that predictions can be made on Greek sentences. This technique is called Language Distillation [1], and in the related work it has been used with a variety of embeddings. A parallel corpus of sentences in the source language (English) and the target language (Greek) is the main resource that makes this kind of transfer learning possible.
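
The alignment step can be illustrated with a minimal sketch. It assumes that the objective is a mean-squared error pulling the student's embeddings of both the English sentence and its Greek translation toward the English teacher's embedding, which is a common formulation for this kind of distillation; the exact loss used in the thesis is not stated here. The function names and the plain-list vectors are hypothetical and chosen only for illustration.

```python
from typing import Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean-squared error between two embeddings of equal length."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("embeddings must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    teacher_en: Sequence[Vector],
    student_en: Sequence[Vector],
    student_el: Sequence[Vector],
) -> float:
    """Average alignment loss over a batch of parallel sentence pairs.

    teacher_en[i]: teacher embedding of the i-th English sentence
    student_en[i]: student embedding of the same English sentence
    student_el[i]: student embedding of its Greek translation
    (Assumed objective; not confirmed by the thesis text.)
    """
    n = len(teacher_en)
    if n == 0 or len(student_en) != n or len(student_el) != n:
        raise ValueError("batches must be non-empty and of equal size")
    total = 0.0
    for t, s_en, s_el in zip(teacher_en, student_en, student_el):
        # Both student outputs are pulled toward the same teacher vector,
        # so the English and Greek sentences end up close to each other.
        total += mse(t, s_en) + mse(t, s_el)
    return total / n
```

In a real training loop the student's parameters would be updated to minimize this value over the parallel corpus; the sketch only shows what is being minimized.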
The datasets used are the Essays corpus, in its original form and in its Greek translation, and the parallel sentences from the TED 2020 talks. Data preparation was an important step: the data were transformed into individual sentences, each labeled as argumentative or not. Data augmentation was also applied because the classes were imbalanced. The transformer-based approach in this thesis uses the BERT [3], SBERT [4] and XLM-RoBERTa [5] models in combination with a deep learning model to produce the final prediction.
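
The data preparation and balancing steps can be sketched as follows. The sketch assumes that each sentence comes with a list of annotated component tags, and that a sentence is labeled argumentative when any tag is a claim or a premise. The tag names are hypothetical. Random oversampling of the minority class is used as a simple stand-in for class balancing, since the augmentation method actually used in the thesis is not specified here.

```python
import random
from typing import Dict, Iterable, List, Tuple

# Hypothetical tag names; the real annotation scheme may differ.
ARGUMENT_COMPONENTS = {"claim", "premise"}

Example = Tuple[str, int]  # (sentence text, label)


def label_sentence(component_tags: Iterable[str]) -> int:
    """Return 1 if the sentence contains a claim or premise, else 0."""
    return int(any(tag.lower() in ARGUMENT_COMPONENTS for tag in component_tags))


def oversample(examples: List[Example], seed: int = 0) -> List[Example]:
    """Balance classes by randomly duplicating minority-class examples.

    This is plain random oversampling, used as an illustrative
    substitute for the unspecified augmentation in the thesis.
    """
    rng = random.Random(seed)
    by_label: Dict[int, List[Example]] = {}
    for example in examples:
        by_label.setdefault(example[1], []).append(example)
    if not by_label:
        return []
    target = max(len(group) for group in by_label.values())
    balanced: List[Example] = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies until this class reaches the majority size.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced
```

Oversampling should be applied only to the training split, so that duplicated sentences do not leak into validation or test data.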