Senior Software Engineer
Unison Consulting PTE LTD | August 2021 - Present
Client: OCBC, Singapore
# T-ELT – Extract, Load and Transform Data from various platforms
• Designed, developed, and re-engineered batch pipelines in the T-ELT framework, cutting the number of pipelines to maintain by 65%, improving SLA by 25% for high-availability datasets delivered to the business, and improving cluster performance and productivity.
• Built a custom Python framework to extract data from various sources and load it into Azure Data Lake using ADF pipelines, then transform and load datasets into Azure Blob Storage for staging using HDInsight, Databricks, Azure Synapse, and Stream Analytics services.
• Developed Spark scripts using Scala shell commands, and applications using Python and Spark SQL, for processing large datasets.
• Migrated Hadoop and Spark workloads with complex transformations using Azure Databricks and integrated Azure Blob Storage and HDFS based on performance. Developed dimension and fact datasets in SQL Data Warehouse for consumption by the reporting team.
• Tuned cluster performance, including capacity planning so that Spark jobs consume resources optimally based on data and processing needs; also optimized Hive queries and integrated them into the Spark environment using Spark SQL.
Senior Software Engineer
Optimum Solutions Pte LTD | September 2019 - August 2021
Client: DBS, Singapore
#C2MA – Source Data Ingestion & Compute in ADA Platform
#SAS Offload – Data Marts migration in ADA Compute Layer
• Collaborated with business and cross-functional teams to improve data mart generation, eliminating 30% of duplicate data marts and saving 40% of cluster space for the marts behind the monthly, quarterly, and yearly reports published for different LOB teams.
• Developed ETL pipelines in ADA to migrate SAS data marts with large numbers of columns, using map-type transformations to make the marts easier to use and generating dynamic columns from source data; reconciled SAS mart data end to end.
• Built an ETL framework to offload the SAS data platform to GCP seamlessly using Spark/Hadoop, with end-to-end reconciliation against the legacy system. Developed a streaming pipeline using Dataflow, Pub/Sub, and Cloud Functions on top of the Apache Beam SDK.
• Contributed to a custom Python framework that extracts data from sources, transforms it, and loads it into a Google BigQuery staging area in the format users specified, using Dataproc, Pub/Sub, and Dataflow.
• Designed automated reconciliation scripts between Teradata and the Cloudera data platform using Spark and the Presto SQL engine.
• Developed Qlik Sense dashboards on Airflow scheduler data to monitor the timely completion of ingestion and compute jobs.
• Automated deployments using Google Cloud Platform services, YAML, and Cloud Deployment Manager where required.
Client: Credit Suisse, Singapore
# Product Control Commentary dashboards for APAC Entities
# Country Pack Deck creation for APAC region
• Developed budget forecasts, financial plans, variance reports, and ad-hoc reports for Group Finance management review and analysis.
• Designed and developed Spark applications in Scala to process large volumes of near-real-time data, cutting generation time for monthly, quarterly, and yearly reports and dashboards by 50% and increasing business productivity by 35%.
• Analyzed and migrated Spark jobs using Azure Data Factory pipelines to store data in Azure Cosmos DB for further transformations.
• Worked with Spark Core RDD transformations and DataFrames for input data processing and aggregations using Azure Databricks.
• Worked extensively on table partitioning strategy and storage-level tuning of Parquet and ORC formats in Spark SQL and Hive tables.
• Implemented partitioning and bucketing in Hive for more efficient data processing and integrated Hive tables with HBase.
JNTU, Hyderabad (Jawaharlal Nehru Technological University)
Computer Science & Engineering, B.Tech | September 2003 - April 2007