LLMs in the context of Code-Switching for Banglish Texts

In our increasingly interconnected global society, communication transcends linguistic boundaries, leading to a phenomenon known as code-switching. Code-switching is the practice of alternating between two or more languages or language varieties within a single discourse. In recent years, the advent of Large Language Models (LLMs) has changed the way we interact with and understand languages. While LLMs perform well on monolingual tasks such as question answering, sentiment analysis, and summarization, their performance degrades on code-switched input. In this work, we focus on improving LLMs’ performance on text that code-switches between Bangla and English.

Related publications

  1. Contextual Bangla Neural Stemmer: Finding Contextualized Root-Word Representations for Bangla Words, 1st Workshop on Bangla Language Processing in conjunction with EMNLP, Association for Computational Linguistics, Singapore, Dec 2023.
  2. Investigating the Effectiveness of Graph-based Algorithm for Bangla Text Classification, 1st Workshop on Bangla Language Processing in conjunction with EMNLP, Association for Computational Linguistics, Singapore, Dec 2023.
  3. BaTEClaCor: A Novel Dataset for Bangla Text Error Classification and Correction, 1st Workshop on Bangla Language Processing in conjunction with EMNLP, Association for Computational Linguistics, Singapore, Dec 2023.