Aniketh Johnson

Data Engineer Intern
Epro Infosystems • May 2021 - September 2021
• Developed a Java Spark pipeline in Azure ADF Batch to consume real-time messages from EventHub and process files into ADLS
• Used Azure Cache for Redis to cache ADLS file paths and automated tasks with bash scripts
• Performed ETL on data from source systems (MSSQL) into Azure data storage services using Azure Data Factory, Spark SQL, and Azure Data Lake
• Orchestrated a data pipeline in Azure Data Factory to incrementally ingest data from SQL Server into Azure Data Lake
• Added an Azure Databricks Notebook activity to ADF to process data with Databricks Auto Loader and save it to Delta Lake tables
• Deployed CI/CD pipelines using Azure DevOps for a cross-functional team working on an ML project
Software Engineer
Accenture • August 2018 - August 2018
• Aggregated large, complex structured, semi-structured, and unstructured data on an Apache Hadoop platform running on the Spark execution engine
• Imported and exported data into HDFS and Hive tables using Java Spark and Sqoop
• Created external tables in Hive and loaded data into them
• Imported data from an Oracle database into HDFS and exported it back, and vice versa
• Wrote Hive queries for ad hoc data analysis to meet business requirements