Research Collaboration

Independent Researcher, , 2022

Implemented statistical methods (resampling, cost-sensitive learning, and SMOTE) to handle class-imbalanced data.
Performed data mining to curate ∼1M tweets in the low-resource Hindi language & conducted emoji prediction using bi-LSTM, mBERT, XLM-R, etc. (Published: EMNLP 2022)
Standardized 9 hate-speech datasets & implemented LSTM, BERT, RoBERTa, etc. (Published: EACL 2023)
Developed a distributed federated learning (FL) architecture that achieved a 14.52% improvement in F1-score while preserving privacy.