Paper accepted at the 3rd Workshop on Efficient Natural Language and Speech Processing, NeurIPS 2023
We are very glad to share that our paper, "HateXplain Space Model: Fusing Robustness with Explainability in Hate Speech Analysis", has been accepted at the "Efficient Natural Language and Speech Processing" workshop @NeurIPS 2023. The paper addresses detecting hateful text in a more robust and explainable fashion.
LLMs demonstrate proficiency in various tasks but encounter difficulties when identifying hate contexts, especially in zero-shot or transfer learning scenarios. To tackle this challenge, we present Space Modeling (SM), an innovative approach that enhances hate context detection by generating word-level attribution and bias scores. These scores offer intuitive insights into model predictions and help recognize hateful terms.
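To give a rough feel for what a word-level attribution score looks like, here is a minimal sketch of generic leave-one-out attribution. It is not the Space Modeling method from the paper. The scorer `toy_score`, the lexicon `PLACEHOLDER_LEXICON`, and the function `word_attributions` are all hypothetical names made up for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical placeholder lexicon; a real system would use a trained model.
PLACEHOLDER_LEXICON = {"badword"}


def toy_score(tokens: List[str]) -> float:
    """Toy stand-in for a classifier: counts tokens found in the lexicon."""
    return float(sum(1 for t in tokens if t.lower() in PLACEHOLDER_LEXICON))


def word_attributions(
    tokens: List[str],
    score_fn: Callable[[List[str]], float] = toy_score,
) -> List[Tuple[str, float]]:
    """Leave-one-out attribution: how much the score drops when a word is removed."""
    base = score_fn(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]  # sentence without word i
        attributions.append((tok, base - score_fn(reduced)))
    return attributions


if __name__ == "__main__":
    for word, weight in word_attributions(["this", "is", "a", "badword"]):
        print(f"{word}: {weight:.1f}")
```

A word whose removal lowers the score the most gets the largest attribution, which is the intuition behind using such scores to point out the terms driving a prediction.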
We are very grateful to Dr. Ruhul Amin (Assistant Professor, Fordham University, USA) for his supervision.