Machine Learning Engineer
Infocusp • April 2020 - July 2021
• Google X (September 2020 to July 2021)
  - Researched and analyzed ML experiments for a confidential NLP project in Neural Machine Translation.
  - Designed and implemented a data pipeline using Apache Beam and Python multiprocessing to fetch and pre-process ~20M unlabeled data points.
  - Improved the previous implementation using Kubernetes Engine and the Pub/Sub service to fetch and pre-process ~5M labeled data points, speeding up the process by ~10x.
  - Designed and implemented a user interface using Next.js to demonstrate the product.
  - Implemented the backend server using Flask; deployed and maintained the backend systems on Google Cloud.
  - Used the Transformer from TensorFlow Official to train models for different tasks. Researched and analyzed different ML experiments and built a state-of-the-art translation model, achieving 80% accuracy for one of the models.
• Innovyze (Autodesk) (April 2020 to July 2021)
  - Increased efficiency by 60% by automating water distribution and chemical dosing in a plant. The process consisted of the following four phases:
    - Data Exploration - Explored the site data using techniques such as time-series analysis, correlation analysis, outlier detection, and trend and seasonality analysis.
    - IRIS clustering - Clustered the data using K-means as well as HAC and verified each cluster using t-SNE.
    - Modeling - Trained ensemble models such as Extreme Gradient Boosting and Random Forest using XGBoost and Scikit-learn, achieving 90% accuracy.
    - Deployment - Used a CI/CD pipeline to deploy the models on an Azure server.
Infosys • June 2019 - March 2020
• Designed and implemented a Python-based tool that converts raw log files into a more presentable HTML format and extracts the relevant information into a tabular format using Beautiful Soup.
• Developed an end-to-end client-server e-commerce application from scratch using Spring Boot, Hibernate, and AngularJS.
• Served as scrum master for a team of four and decreased latency by 10% by reducing the number of database queries.
Product Development Internship
Sprinklr • May 2018 - July 2018
Rutgers, The State University of New Jersey
Computer Science, MS • September 2021 - Present
DAIICT, Gandhinagar (Dhirubhai Ambani Institute of Information and Communication Technology)
Information Technology, B.Tech • August 2015 - May 2019