Crop Identification and Yield Estimation using SAR Data

Accurate crop identification and yield estimation are crucial for policymakers to develop effective agricultural policies, allocate resources efficiently, and support farmers in adopting suitable technologies. However, optical remote sensing methods, commonly used for these tasks, face challenges due to cloud cover and adverse weather. King et al. (2013) [4] estimated that, on average, approximately 67 percent of the Earth's surface is obscured by clouds, making it difficult to obtain high-quality optical remote sensing data. Humid and semi-humid climate zones with abundant water sources pose further challenges for remote sensing in agriculture. To overcome these limitations, this project aims to utilize Synthetic Aperture Radar (SAR) data for crop identification and yield estimation. SAR uses microwaves that penetrate clouds, enabling continuous data collection regardless of lighting and weather conditions. Because SAR is sensitive to both the dielectric and geometrical characteristics of plants, it captures information beneath the vegetation canopy and provides insights into crop structure and health. Furthermore, SAR offers flexibility in imaging parameters such as incidence angle and polarization configuration, facilitating the extraction of diverse information about agricultural landscapes.
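
To make the intended pipeline concrete, the sketch below classifies crop types from multi-temporal Sentinel-1 VV/VH backscatter features with a random forest; the synthetic arrays, feature layout, class labels, and choice of classifier are illustrative assumptions rather than fixed design decisions:

```python
# Minimal sketch: crop classification from multi-temporal SAR backscatter.
# The data here are random stand-ins; in practice, per-pixel features would
# come from calibrated, terrain-corrected Sentinel-1 scenes sampled at
# ground-truth field locations.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# n pixels x (T dates x 2 polarizations VV, VH), backscatter in dB.
n_pixels, n_dates = 2000, 12
X = rng.normal(loc=-12.0, scale=3.0, size=(n_pixels, n_dates * 2))
y = rng.integers(0, 4, size=n_pixels)  # e.g. 0=rice, 1=wheat, 2=maize, 3=other

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```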

Related Works:

  1. D. Suchi, A. Menon, A. Malik, J. Hu and J. Gao, "Crop Identification Based on Remote Sensing Data using Machine Learning Approaches for Fresno County, California," 2021 IEEE Seventh International Conference on Big Data Computing Service and Applications (BigDataService), Oxford, United Kingdom, 2021, pp. 115-124, doi: 10.1109/BigDataService52369.2021.00019.
  2. Liu, C., Chen, Z., Shao, Y., Chen, J., Hasi, T., & Pan, H. (2019). Research advances of SAR remote sensing for agriculture applications: A review. Journal of Integrative Agriculture, 18(3), 506-525.
  3. J. Singh, U. Devi, J. Hazra and S. Kalyanaraman, "Crop-Identification Using Sentinel-1 and Sentinel-2 Data for Indian Region," IGARSS 2018 – 2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 2018, pp. 5312-5314, doi: 10.1109/IGARSS.2018.8517356.
  4. King, M. D., Platnick, S., Menzel, W. P., Ackerman, S. A., & Hubanks, P. A. (2013). Spatial and Temporal Distribution of Clouds Observed by MODIS Onboard the Terra and Aqua Satellites. IEEE Transactions on Geoscience and Remote Sensing, 51(7), 3826-3852. doi: 10.1109/TGRS.2012.2227333

Test-Time Domain Adaptation for Urban Categorization from Satellite Images

The urban environment is a complex system comprising elements such as buildings, roads, vegetation, and water bodies. Classifying urban areas from satellite images, or from images captured by UAVs, is an important task for urban planning, disaster management, and environmental monitoring. Urban environments differ across cities of the world, and existing models for urban classification struggle to adapt to these differences. This project aims to develop an adaptive model for classifying urban areas from satellite images. The model will generalize across different urban environments and adapt to changing environments in real time. It will be trained on a large dataset of satellite images of cities from different parts of the world. At inference (test) time, the model's parameters will be updated based on changes in the urban environment in which it is deployed. This work builds on recent test-time domain adaptation methods and on our earlier research on the categorization of urban built-up areas [1], land use, and land cover [2,3].
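
One common instantiation of this idea is test-time entropy minimization in the style of TENT (Wang et al., 2021), where only the normalization-layer parameters are updated on unlabeled test batches. The sketch below illustrates the core update loop; the toy model, class count, and input sizes are our illustrative assumptions:

```python
# Minimal sketch of test-time adaptation via entropy minimization (in the
# spirit of TENT): only BatchNorm affine parameters are updated on
# unlabeled test batches. Model and data are toy stand-ins.
import torch
import torch.nn as nn

model = nn.Sequential(  # stand-in for a pretrained urban classifier
    nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 5))

# Collect only the BatchNorm affine parameters; freeze everything else.
params = []
for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        params += [m.weight, m.bias]
for p in model.parameters():
    if all(p is not q for q in params):
        p.requires_grad_(False)

optimizer = torch.optim.SGD(params, lr=1e-3)

def adapt_on_batch(x):
    """One adaptation step on an unlabeled test batch."""
    logits = model(x)
    probs = logits.softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return logits.detach()

x_test = torch.randn(8, 3, 64, 64)  # stand-in for a satellite image batch
_ = adapt_on_batch(x_test)
```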

Related Works:

  1. Cheng, Q.; Zaber, M.; Rahman, A.M.; Zhang, H.; Guo, Z.; Okabe, A.; Shibasaki, R. Understanding the Urban Environment from Satellite Images with New Classification Method—Focusing on Formality and Informality. Sustainability 2022, 14, 4336. https://doi.org/10.3390/su14074336
  2. Rahman, A.K.M.M.; Zaber, M.; Cheng, Q.; Nayem, A.B.S.; Sarker, A.; Paul, O.; Shibasaki, R. Applying State-of-the-Art Deep-Learning Methods to Classify Urban Cities of the Developing World. Sensors 2021, 21, 7469. https://doi.org/10.3390/s21227469
  3. Niloy, F.F., et al. Attention Toward Neighbors: A Context Aware Framework for High Resolution Image Segmentation. 2021 IEEE International Conference on Image Processing (ICIP), IEEE, 2021.

Developing a Multi-Agent Framework for Multimodal Multi-Task Learning

This project focuses on enhancing the capabilities of large multimodal models. Multimodal learning is an area of machine learning in which models are designed to process and correlate information from various input modalities, such as text, images, and audio. In this project, we are developing a multi-agent framework in which each agent specializes in understanding a specific modality and task. These agents work in tandem: the framework dynamically incorporates the agents specialized for the tasks at hand, enabling the system to handle multiple tasks simultaneously. By integrating these multi-agent ideas into large multimodal models, our project aims to significantly improve performance on multi-task learning and generalization to new tasks.
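
As a minimal sketch of the orchestration pattern we have in mind (the Request fields, registry keys, and stub agents are illustrative assumptions; real agents would wrap actual multimodal models), a dispatcher can route each request to the agent specialized for its modality and task:

```python
# Minimal sketch of multi-agent routing: each agent handles one
# (modality, task) pair, and an orchestrator dispatches incoming requests
# to the matching specialist.
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class Request:
    modality: str   # e.g. "text", "image", "audio"
    task: str       # e.g. "summarize", "caption", "transcribe"
    payload: object

class Orchestrator:
    def __init__(self):
        self._agents: Dict[Tuple[str, str], Callable[[object], str]] = {}

    def register(self, modality: str, task: str,
                 agent: Callable[[object], str]) -> None:
        self._agents[(modality, task)] = agent

    def dispatch(self, req: Request) -> str:
        agent = self._agents.get((req.modality, req.task))
        if agent is None:
            raise KeyError(f"no agent for {(req.modality, req.task)}")
        return agent(req.payload)

orch = Orchestrator()
orch.register("text", "summarize", lambda p: f"summary of {p!r}")
orch.register("image", "caption", lambda p: f"caption for {p!r}")
print(orch.dispatch(Request("text", "summarize", "a long document...")))
```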

Related publications:

  1. Large Multimodal Agents: A Survey
    Xie, J., Chen, Z., Zhang, R., Wan, X., & Li, G. (2024). Large Multimodal Agents: A Survey. arXiv:2402.15116. https://doi.org/10.48550/arXiv.2402.15116 
  2. AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System
    Liu, Z., Yao, W., Zhang, J., Yang, L., Liu, Z., Tan, J., Choubey, P. K., Lan, T., Wu, J., Wang, H., Heinecke, S., Xiong, C., & Savarese, S. (2024). AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System. arXiv:2402.15538. https://doi.org/10.48550/arXiv.2402.15538
  3. MuLan: Multimodal-LLM Agent for Progressive Multi-Object Diffusion
    Li, S., Wang, R., Hsieh, C.-J., Cheng, M., & Zhou, T. (2024). MuLan: Multimodal-LLM Agent for Progressive Multi-Object Diffusion. arXiv:2402.12741. https://doi.org/10.48550/arXiv.2402.12741

Non-Rigid Distortion Removal via Coordinate Based Image Representation

Imaging through a turbulent refractive medium (e.g., hot air, inhomogeneous gas, fluid flow) is challenging, since the non-linear light transport through the medium (e.g., refraction and scattering) causes non-rigid distortions in the perceived images. However, most computer vision algorithms rely on sharp, distortion-free images to achieve their expected performance. Removing these non-rigid image distortions is therefore critical and beneficial for many vision applications, from segmentation to recognition. To resolve the distortion and blur introduced by air turbulence, conventional turbulence restoration methods leverage optical flow, region fusion, and blind deconvolution to recover images. One underexplored avenue for this problem is the use of coordinate-based image representations. These methods represent an image as the parameters of a neural network, and they can be used to deform the image grid itself to account for turbulence. In this research, we aim to extend this idea to unseen images via meta-learning, so that it can remove both air and water distortions without much customization.
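
The following sketch illustrates the core idea under our own simplifying assumptions (network sizes, a single observed frame, and a plain MLP without positional encoding): one network represents the image as a function of pixel coordinates, while a second predicts a per-pixel displacement of the sampling grid that absorbs the non-rigid distortion:

```python
# Minimal sketch of a coordinate-based image representation with a learned
# deformation field: f maps a (possibly deformed) coordinate to an RGB
# value, while d predicts the grid displacement modeling the warp.
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, width=128, depth=4):
    layers, dim = [], in_dim
    for _ in range(depth - 1):
        layers += [nn.Linear(dim, width), nn.ReLU()]
        dim = width
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)

f = mlp(2, 3)   # color network: (x, y) -> (r, g, b)
d = mlp(2, 2)   # deformation network: (x, y) -> (dx, dy)

H = W = 64
ys, xs = torch.meshgrid(
    torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)

target = torch.rand(H * W, 3)  # stand-in for an observed distorted frame
opt = torch.optim.Adam(list(f.parameters()) + list(d.parameters()), lr=1e-3)

for step in range(200):
    warped = coords + d(coords)           # deform the sampling grid
    pred = f(warped)                      # render through the warp
    loss = ((pred - target) ** 2).mean()  # reconstruct the observation
    opt.zero_grad()
    loss.backward()
    opt.step()

# After fitting, f(coords) (without the deformation) approximates the
# distortion-free image.
```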

Related publications:

  1. Unsupervised Non-Rigid Image Distortion Removal via Grid Deformation, ICCV 2021

Adaptive LLM-based Tutor for Personalized Python Learning

Because of their varied backgrounds and skill levels, students in programming education frequently confront a variety of difficulties. Traditional learning platforms typically do not support personalized learning, which reduces their efficacy. Our goal is to construct an intelligent tutoring system based on LLMs that can solve problems and reason in order to provide students with tutor-like guidance. Additionally, we want to establish engaging interactions between students and the tutor, and during these exchanges we would like to learn as much as possible about the tutor's internal decision-making process. Furthermore, in order to deliver a more approachable and natural experience aligned with the learner's needs and the curriculum objectives, the system will need to recognize and monitor, as far as possible, the individual preferences and mental state of each learner.
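
As a loose sketch of how such a tutoring loop might condition its guidance on a learner profile (the profile fields, prompt wording, and the llm_generate placeholder are all hypothetical, standing in for any real chat-completion API):

```python
# Minimal sketch of a personalized tutoring turn. `llm_generate` is a
# placeholder, not a real API; profile fields are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class LearnerProfile:
    skill_level: str = "beginner"          # updated as the learner progresses
    preferences: list = field(default_factory=lambda: ["short examples"])
    misconceptions: list = field(default_factory=list)

def llm_generate(system_prompt: str, user_message: str) -> str:
    """Placeholder for a real LLM call (e.g. a chat-completion endpoint)."""
    return f"[tutor reply conditioned on: {system_prompt[:60]}...]"

def tutor_turn(profile: LearnerProfile, question: str) -> str:
    system_prompt = (
        "You are a patient Python tutor. "
        f"The learner is {profile.skill_level}; "
        f"prefers {', '.join(profile.preferences)}; "
        f"known misconceptions: {profile.misconceptions or 'none'}. "
        "Guide with hints before full solutions."
    )
    return llm_generate(system_prompt, question)

profile = LearnerProfile()
print(tutor_turn(profile, "Why does my for loop print the index, not the item?"))
```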

LLMs in the context of Code-Switching for Banglish Texts

In our increasingly interconnected global society, communication transcends linguistic boundaries, leading to a phenomenon known as code-switching: the practice of alternating between two or more languages or language varieties within a single discourse. In recent years, the advent of Large Language Models (LLMs) has revolutionized the way we interact with and understand languages. While LLMs perform quite well on monolingual tasks such as question answering, sentiment analysis, and summarization, their performance degrades in code-switching scenarios. In this work, we focus on enhancing LLMs' performance in the context of code-switching between Bangla and English ("Banglish").
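
One preprocessing step we are considering is token-level language tagging of Banglish input before it reaches the LLM. The sketch below uses a deliberately tiny word list and a lookup heuristic purely for illustration; a real system would use a trained language-identification model:

```python
# Minimal sketch: tag each token of Banglish (romanized Bangla + English)
# text as 'en' or 'bn-rom'. The word list and heuristic are illustrative
# assumptions only.
ENGLISH_WORDS = {"the", "is", "very", "good", "movie", "but",
                 "i", "think", "ending", "weak"}

def tag_tokens(sentence: str):
    """Tag each token as 'en' or 'bn-rom' (romanized Bangla) by lookup."""
    tags = []
    for tok in sentence.lower().split():
        lang = "en" if tok.strip(".,!?") in ENGLISH_WORDS else "bn-rom"
        tags.append((tok, lang))
    return tags

# A code-switched sentence: "the movie was very good but the ending was weak"
print(tag_tokens("movie ta khub valo chilo but ending ta weak"))
```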

Related publications:

  1. Contextual Bangla Neural Stemmer: Finding Contextualized Root-Word Representations for Bangla Words, 1st Workshop on Bangla Language Processing, in conjunction with EMNLP, Association for Computational Linguistics, Singapore, Dec 2023.
  2. Investigating the Effectiveness of Graph-based Algorithm for Bangla Text Classification, 1st Workshop on Bangla Language Processing, in conjunction with EMNLP, Association for Computational Linguistics, Singapore, Dec 2023.
  3. BaTEClaCor: A Novel Dataset for Bangla Text Error Classification and Correction, 1st Workshop on Bangla Language Processing, in conjunction with EMNLP, Association for Computational Linguistics, Singapore, Dec 2023.

Knowledge Graph and LLMs based QA System

The emergence of advanced large language models (LLMs), such as GPT-4 and LLaMA, marks a significant shift in information retrieval and Question Answering (QA) systems. Unlike traditional keyword-focused search, these models can generate text that is more intuitive and human-like. Trained on huge amounts of data, these models apparently "understand" the subtleties of language, context, and user intent. However, LLMs have a few significant limitations: the models may "hallucinate," and they have limited domain knowledge, common sense, etc. Knowledge Graphs (KGs) can help overcome some of these challenges by providing a structured representation of domain knowledge. A KG is a database that stores information in the form of a graph, with nodes representing entities and edges representing relationships between them. KGs can enhance the reasoning ability of LLMs in QA systems by providing context and domain knowledge related to the questions. In this research, we focus on extracting domain-specific knowledge subgraphs and enhancing their representations using graph neural networks to solve QA tasks with LLMs.
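
A minimal sketch of the retrieval side of this pipeline, under our own illustrative assumptions (a toy graph, substring entity matching, and an llm_answer placeholder standing in for a real LLM call): triples near the entities mentioned in a question are collected and serialized as grounding context for the prompt:

```python
# Minimal sketch of KG-augmented QA: retrieve a small subgraph around
# entities named in a question, serialize its triples, and prepend them to
# the LLM prompt. Graph contents and matching are illustrative only.
import networkx as nx

kg = nx.MultiDiGraph()
kg.add_edge("Metformin", "Type 2 diabetes", relation="treats")
kg.add_edge("Metformin", "Lactic acidosis", relation="may_cause")
kg.add_edge("Type 2 diabetes", "Insulin resistance", relation="involves")

def extract_subgraph(question: str, hops: int = 1):
    """Collect triples within `hops` of entities named in the question."""
    seeds = [n for n in kg.nodes if n.lower() in question.lower()]
    nodes = set(seeds)
    for _ in range(hops):
        for n in list(nodes):
            nodes |= set(kg.successors(n)) | set(kg.predecessors(n))
    return [(u, d["relation"], v) for u, v, d in kg.edges(data=True)
            if u in nodes and v in nodes]

def llm_answer(context: str, question: str) -> str:
    """Placeholder for a real LLM call grounded on the serialized triples."""
    return f"[answer grounded in context: {context}]"

q = "What condition does metformin treat?"
triples = extract_subgraph(q)
context = "; ".join(f"{u} {r} {v}" for u, r, v in triples)
print(llm_answer(context, q))
```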

Few-Shot Human Activity Recognition from Wearable Sensors

We stand at the forefront of transforming remote healthcare by pioneering sensor-based human activity recognition (HAR). Our primary objective is to develop state-of-the-art ML models specifically designed for deployment on remote devices, enabling the continuous monitoring of patients and elderly individuals who require ongoing support. A significant challenge in this endeavor is the scarcity of labeled data for various activity classes, which makes training traditional models difficult. To address this, we are actively working on the few-shot learning problem, so that our models can adapt from minimal labeled examples. This work builds on our earlier work on self-attention-based HAR and on the assessment of rehabilitation exercises using sensor data.
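
One standard formulation we build on is the prototypical-network episode, sketched below with an illustrative stand-in encoder and synthetic accelerometer windows: class prototypes are the mean embeddings of a few labeled support windows, and queries are assigned to the nearest prototype:

```python
# Minimal sketch of few-shot HAR with prototypical networks. The encoder
# and fake sensor windows are stand-ins; a real system might use a
# self-attention encoder over real accelerometer/gyroscope windows.
import torch
import torch.nn as nn

encoder = nn.Sequential(  # stand-in embedding network
    nn.Conv1d(3, 32, 5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten())

def prototypes(support_x, support_y, n_classes):
    """Mean embedding per class over the labeled support set."""
    z = encoder(support_x)
    return torch.stack([z[support_y == c].mean(dim=0) for c in range(n_classes)])

def classify(query_x, protos):
    """Assign each query window to its nearest class prototype."""
    zq = encoder(query_x)            # (Q, D)
    dists = torch.cdist(zq, protos)  # (Q, n_classes)
    return dists.argmin(dim=1)

# 3-way, 5-shot episode on fake accelerometer windows (3 axes x 128 samples).
n_way, k_shot = 3, 5
support_x = torch.randn(n_way * k_shot, 3, 128)
support_y = torch.arange(n_way).repeat_interleave(k_shot)
query_x = torch.randn(10, 3, 128)

protos = prototypes(support_x, support_y, n_way)
print(classify(query_x, protos))
```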

Related publications:

  1. Hierarchical Self Attention Based Autoencoder for Open-Set Human Activity Recognition, 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2021), Springer, May 11-14, 2021, Delhi, India.
  2. Human Activity Recognition from Wearable Sensor Data using Self-Attention, in the proceedings of the 24th European Conference on Artificial Intelligence (ECAI), Spain, 2020.
  3. Assessment of Rehabilitation Exercises from Depth Sensor Data, International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, December 18-20, 2021.
  4. An Integrated System for Stroke Rehabilitation Exercise Assessment using Kinect v2 and Machine Learning, International Conference on Intelligent Human Computer Interaction, Proceedings of LNCS, Springer, Nov 2023.

Exploring Relational Agents for Different Healthcare Applications

Relational agents (RAs) are a special type of computer program or virtual entity designed to interact with humans in a way that simulates social interactions. These agents are equipped with artificial intelligence (AI) and natural language processing capabilities, allowing them to engage in conversations, interpret emotions, and respond with empathetic and contextually appropriate behaviors. They play a pivotal role in human-computer interaction, particularly in fields like healthcare, where personalized and compassionate communication is crucial.

In this research, we explore different aspects and applications of RAs in the domain of healthcare services. Our earlier work examined the efficacy, acceptance, usability, and other basic measures of RAs for healthcare services, particularly during COVID-19. Currently, we are investigating future opportunities for employing RAs in diverse healthcare applications, including gestational diabetes, epidemics, and health education. We are also working toward achieving universal health coverage (UHC) in Bangladesh by utilizing RAs capable of interacting in Bangla. These research works are conducted jointly with the Data and Design Nest at the University of Louisiana at Lafayette, USA. Outcomes of this initiative have been published in ACM UIST 2022, ACM HAI 2021, IEEE ISCC 2023, JMIR Human Factors, IJERPH, PervasiveHealth 2021, and DESRIST 2021.