Work Experience
Data Engineer
Citizens Financial Group•  September 2020 - Present
* Designed and developed ETL data pipelines in Talend to ingest data into AWS S3 from more than 300 sources (files and databases).
* Created Python modules to migrate data from S3 to Redshift using AWS Glue and embedded them in the ETL pipeline.
* Eliminated manual data pre-processing and schema validation by automating both processes in Python.
* Took part in daily status calls covering the design and creation of pipelines for new data sources, automations, crawlers, and root cause analysis (RCA) of ETL pipeline failures.
* Used the Autosys scheduler to run pipelines automatically according to how often each source's data becomes available.
Software Engineer Intern
Rapid Data LLC•  February 2020 - April 2020
Designed and built a resume parser, a key part of the Resumator web application, using Apache Tika, natural language processing, text analysis, and named entity recognition. Apache Tika extracts data from different document types; spaCy, NLTK, and text classification techniques then produce structured JSON, which is stored in MongoDB. The extracted data serves multiple purposes, such as sharing resumes with recruiters, understanding the job market, and converting a resume into a different template. Built the API using Node.js with JWT authentication. The application was deployed on Azure, and work was tracked and developed using Azure DevOps (VSTS).
Software Engineer
Accenture•  November 2016 - December 2018
Gained extensive exposure to troubleshooting and performance tuning of large-scale live production systems. Daily work involved database maintenance activities such as tablespace management, Data Pump operations, and log purging. To improve database performance, performed tablespace/schema reorganization, index recreation, and SQL query rewrites (making jobs use index scans rather than full table scans). Reviewed code with developers to write or suggest changes to SQL, PL/SQL, and T-SQL code before it was pushed to production. Created a real-time monitoring system for all databases by monitoring logs and building alerts and dashboards in Splunk. Optimized two database-scan jobs, cutting the run time of each by 96% by creating new indexes and rewriting SQL queries. Automated the reporting process using shell scripts, reducing the run time of report generation and pre-processing jobs by 30%. Automated password changes for non-dependent users and the related reporting using shell and Python to improve database security and comply with the client's security rules.
Education
University of North Carolina at Charlotte, Charlotte
Computer Science, MS•  January 2019 - Present
ACADEMIC PROJECTS:

Event-Based Carpooling
An application that lets users offer or request carpooling rides for upcoming events pulled from the free SeatGeek API. Built on a client-server architecture with an Agile delivery method: ReactJS serves as the client/front end and Python Flask as the server/back end.

Socialry
A full-stack social networking application where users can create and join social events they are interested in. Designed and built with NodeJS, ExpressJS, MongoDB, HTML, and CSS so that it works on laptops, mobile phones, and tablets. Key features include login, session management, dynamic data population in ExpressJS templates, and async/await to handle promises for data retrieval.

Gender Identification from Twitter Posts
Extracted data from the Twitter API using the Python requests package and loaded it into a CSV file. Performed data pre-processing (stop-word removal, stemming, and lemmatization) and created a TF-IDF model with n-gram features. Trained multiple classifiers (SVM, Naïve Bayes, Multinomial Naïve Bayes) and used K-fold validation to select better parameters, achieving an accuracy of 79.42% with ensemble learning.

Google Play Store App Analysis
Formed multiple hypotheses from an initial look at the data and confirmed them with visualizations in Tableau; implemented pre-processing and dimension reduction in Python. Identified the most and least popular apps using K-means clustering. Determined app ratings using multiple linear regression. Using a regression tree (R² = 0.536), identified the factors that most influence app installs.

Real-Time Facial Emotion Recognition
Designed a framework that detects human facial emotions using OpenCV DNN, even when faces are inclined or tilted; an earlier version was implemented with the Haar frontal-face classifier. The model predicted emotions with 75% accuracy.

Mentor Management System
Designed and implemented a MySQL database for an application in which new joiners in a corporate system interact with their mentors. Created objects such as clustered and non-clustered indexes, triggers, stored procedures, functions, and events, along with SQL queries using different joins, to keep network traffic low and data accessibility high.

Identify Influencers in Chats
Used sociolinguistic behaviors (topic control indices, emotive language use, measure of argument diversity, network centrality) to identify the most influential person in a given conversation text. Performed hypothesis testing on the initial hypothesis and confirmed it by means of a significance score.
Osmania University
Electrical Engineering, B.E•  September 2012 - July 2016