Work Experience
Software Engineer
Airtel Africa • January 2022 - Present
- Improved the Kafka Sink Writer implementation to strengthen Big Data lifecycle handling, optimizing both data processing and storage efficiency.
- Devised a Big Data deletion lifecycle for the Time Series Application framework, cutting storage costs.
- Used the NiFi REST API to apply property updates to processors in bulk (see the REST sketch after this list).
- Architected streamlined Apache NiFi data flows, connecting processors for exception handling, deduplication, and detailed logging to PostgreSQL to manage duplicate or erroneous data.
- Incorporated regex patterns in NiFi to route data precisely to its destinations, ensuring accurate and efficient data movement.
- Performed seamless NiFi production upgrades.
- Enhanced Hive's partition-dropping functionality using Trino, improving query execution speed and resource utilization (see the Trino sketch after this list).
- Collaborated with the data engineering team to build an Apache Spark data pipeline that handles large volumes of data daily, delivering faster insights to marketing and business units (see the PySpark sketch after this list).
- Tuned and optimized Hive queries, cutting query execution times by 30%.
- Introduced Apache Airflow as a data orchestration framework to streamline and automate complex ETL workflows, minimizing manual intervention and strengthening the reliability and resilience of data pipelines (see the DAG sketch after this list).
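A minimal sketch of the bulk property update described above, using NiFi's REST API from Python. The host, processor IDs, and the "Batch Size" property are hypothetical placeholders; updates must echo back the revision NiFi returns.

    import requests

    NIFI_API = "http://nifi-host:8080/nifi-api"           # assumed NiFi endpoint
    PROCESSOR_IDS = ["processor-id-1", "processor-id-2"]  # hypothetical processor IDs

    for pid in PROCESSOR_IDS:
        # Fetch the current entity first; NiFi rejects updates without its revision.
        entity = requests.get(f"{NIFI_API}/processors/{pid}").json()
        payload = {
            "revision": entity["revision"],
            "component": {
                "id": pid,
                # Example property change applied uniformly across processors.
                "config": {"properties": {"Batch Size": "500"}},
            },
        }
        requests.put(f"{NIFI_API}/processors/{pid}", json=payload).raise_for_status()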
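One way the Trino-based partition dropping could look, sketched with the trino Python client; the coordinator host, table, and date predicate are illustrative. In Trino's Hive connector, a DELETE whose predicate covers whole partitions is executed as a metadata-only partition drop.

    import trino

    # Assumed coordinator and catalog settings; adjust for the actual cluster.
    conn = trino.dbapi.connect(
        host="trino-coordinator", port=8080, user="etl",
        catalog="hive", schema="default",
    )
    cur = conn.cursor()

    # A predicate on partition columns alone lets the Hive connector drop the
    # matching partitions outright instead of scanning and rewriting rows.
    cur.execute("DELETE FROM events WHERE event_date < DATE '2022-01-01'")
    cur.fetchall()  # drain the result so the statement runs to completion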
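A compact PySpark sketch in the spirit of the daily pipeline above; the paths, column names, and aggregation are hypothetical stand-ins for the real sources and metrics.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily-insights").getOrCreate()

    # Hypothetical input location and schema.
    raw = spark.read.parquet("hdfs:///data/events/dt=2024-01-01")

    daily = (
        raw.dropDuplicates(["event_id"])  # guard against replayed events
           .groupBy("campaign_id")
           .agg(F.count("*").alias("events"),
                F.countDistinct("user_id").alias("users"))
    )

    # Overwriting a date-partitioned mart keeps reruns idempotent.
    daily.write.mode("overwrite").parquet("hdfs:///marts/campaign_daily/dt=2024-01-01")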
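A minimal Airflow 2.x DAG sketch matching the orchestration bullet; the dag_id, schedule, and the three callables are hypothetical stand-ins for the actual ETL steps.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():    # hypothetical step: pull raw data from the sources
        ...

    def transform():  # hypothetical step: clean, deduplicate, aggregate
        ...

    def load():       # hypothetical step: publish to the warehouse
        ...

    with DAG(
        dag_id="etl_pipeline",  # illustrative name
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        # Explicit dependencies replace manual hand-offs between steps.
        t_extract >> t_transform >> t_load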
Education
NIT, Bhopal (Maulana Azad National Institute of Technology)
Computer Science & Engineering, B.Tech • July 2018 - April 2022
Skills: C++, Java, Python, Linux, Shell, DBMS, Operating Systems, Computer Architecture, Big Data.

Projects
I developed several machine-coding projects during college, focusing on OOP principles, maintainability, readability, test cases, and company coding standards.

1. QuantumFlow Databricks ETL Pipeline for Azure Data Lake
- Created an Azure Data Lake on Gen2, implementing Silver, Gold, and Platinum tiers for raw, preprocessed, and final report data.
- Streamlined data processing workflows by orchestrating Databricks clusters, pools, and jobs.
- Secured credential management with Azure Key Vault for Databricks' Azure Storage integration (see the secrets sketch below).
- Implemented Delta Lake for a resilient Lakehouse architecture, supporting both full-refresh and incremental-load patterns (see the merge sketch below).
- Leveraged Unity Catalog for data governance, covering data discovery, audit, lineage, and access control.
- Developed a comprehensive Databricks notebook encapsulating data processing and transformation.
- Engineered end-to-end data pipelines, ensuring reliable execution of Databricks notebooks and robust handling of unexpected scenarios.
- Implemented error handling and logging mechanisms to keep the pipelines resilient.
- Demonstrated proficiency in scalable, cloud-based data solutions.
- Upheld data integrity and security standards throughout the project lifecycle.

2. Course Scheduling - Designed a course allotment solution for a Learning Management System. Check it out on GitHub: https://github.com/mutineleo/CourseScheduling

3. LedgerCo - Created an application that calculates EMIs and tracks the remaining amount to pay. Find the project on GitHub: https://github.com/mutineleo/LedgerCo

4. Splitwise - Developed a hassle-free expense-tracking application for managing shared bills and expenses within a group. Explore the project on GitHub: https://github.com/mutineleo/SplitWise
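A minimal sketch of the Key Vault integration inside a Databricks notebook, assuming a Key Vault-backed secret scope "kv-scope", a secret "storage-account-key", and a storage account "mydatalake" (all hypothetical); dbutils and spark are provided by the Databricks runtime.

    # Runs inside a Databricks notebook; dbutils and spark come from the runtime.
    storage_key = dbutils.secrets.get(scope="kv-scope", key="storage-account-key")

    # Hand the ADLS Gen2 account key to Spark without it ever appearing in code.
    spark.conf.set(
        "fs.azure.account.key.mydatalake.dfs.core.windows.net",
        storage_key,
    )

    df = spark.read.parquet("abfss://silver@mydatalake.dfs.core.windows.net/events/")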
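A sketch of the two load patterns with Delta Lake; the paths and the account_id join key are illustrative. A full refresh overwrites the table wholesale, while the incremental pattern upserts new and changed rows with MERGE.

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()  # assumes a Delta-enabled cluster

    updates = spark.read.parquet("/mnt/silver/daily_batch")  # hypothetical source

    # Full-refresh pattern: rewrite the table from scratch.
    # updates.write.format("delta").mode("overwrite").save("/mnt/gold/accounts")

    # Incremental pattern: upsert on the business key.
    target = DeltaTable.forPath(spark, "/mnt/gold/accounts")
    (target.alias("t")
        .merge(updates.alias("s"), "t.account_id = s.account_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())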