Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

About me

Posts

portfolio

Piper Chat

A chatbot application developed using MERN stack and Seq2Seq models in PyTorch.

publications

talks

teaching

Margosatree Technologies

Software Developer || January, 2020 - June, 2020.

  • Used Hadoop and Spark to detect 129 anomalies in 2TB of real-time data using the Isolation Forest (iForest) algorithm.
  • Performed data analysis using PCA & multivariate linear modeling; rendered visualizations using Python & SQL.

Feople Org

Machine Learning Intern || July, 2020 - December, 2020.

  • Devised and implemented dynamic pricing strategy resulting in 28% increase in restaurant sales.
  • Deployed ML models in production through AWS EC2, Amazon SageMaker endpoints, ML APIs, and Docker.
  • Worked on predictive analytics & AI modelling (Usage Forecast & Recommender System). Conducted analysis using statistics tools (A/B Testing).

Dwarkadas J. Sanghvi College of Engineering

Research Assistant || January, 2021 - June, 2021
Advisor: <a href="https://rammangrulkar.github.io" target="_blank">Dr. Ramchandra Mangrulkar</a>

  • Implemented 4 efficient aggregation strategies for federated learning on non-IID medical data, using ResNet & U-Net.
  • Trained ULMFiT & AWD-LSTM models for detection of spear phishing on a corpus of ~73k emails.
  • Published 2 chapters with Dr. Ramchandra Mangrulkar in Chapman and Hall/CRC in the domain of FL & NLP.

Indian Institute of Technology (IIT), Gandhinagar

Research Intern || May, 2021 - January, 2022
Advisor: <a href="https://mayank4490.github.io/" target="_blank">Dr. Mayank Singh</a>

  • Demystified automated peer-review generators and evaluated their robustness to adversarial perturbations.
  • Formulated desiderata for an ideal review generator system and provided a public leaderboard along with a framework for unified & comprehensive measurement of their performance.
  • Won Best Presentation Award at the SRIP program & participated in discussions at LINGO (Computational Linguistics Group).

Research Collaboration

Independent Researcher || January, 2022 - April, 2023
Advisor: <a href="https://aclanthology.org/people/z/zeerak-talat/" target="_blank">Dr. Zeerak Talat</a>

  • Implemented statistical methods (re-sampling, cost-sensitive learning, and SMOTE) for class-imbalanced data.
  • Performed data mining to curate ~1M tweets in low-resource Hindi & conducted emoji prediction using bi-LSTM, mBERT, XLM-R, etc. (Published: EMNLP 2022)
  • Standardized 9 hate-speech datasets & implemented LSTM, BERT, RoBERTa, etc. (Published: EACL 2023)
  • Developed distributed FL architecture to obtain 14.52% improvement in F1-score while preserving privacy.

Unicode Research

Research Student || January, 2022 - Present
Advisor: <a href="https://unicode-research.netlify.app/people/" target="_blank">Swapneel Mehta, Dr. Akash Srivastava</a>

  • Served as TA for the Google Research-funded 9-week machine learning course UMLSC 2021, with 100+ students.
  • Built data pipeline for mining Twitter data and managed workflows using Airflow, AWS EC2 & S3, and Docker.
  • Worked with the SimPPL team to build civic integrity tools that help newsrooms better understand their audiences on social networks (supported by Wikimedia Foundation, Google, AWS, NYC Media Lab).

Cisco

Software Engineer II (Intern) || May, 2023 - August, 2023.

  • Created tools for K8s clusters and CI/CD pipelines to enhance software supply chain security.