Duration: 2 Months
The main focus of Data Engineering is to build Data Pipelines that gather data from various sources, store it in Data Lakes or Data Warehouses, and then deliver it to downstream systems. In this course, you will learn how to build Data Engineering Pipelines using the Azure Data Analytics Stack, which comprises services such as Azure Storage (Blob and ADLS), ADF Data Flow, ADF Pipeline, Azure SQL, Azure Synapse, Azure Databricks, and many others.
The course begins by guiding you through setting up the learning environment using VS Code on both Windows and Mac. Once the environment is ready, you will need to sign up for the Azure Portal. We provide step-by-step instructions on how to sign up for an Azure Portal account, review billing, and receive a USD 200 credit, which is valid for up to a month.
Azure Storage is commonly used as a Data Lake. As part of this course, you will learn how to use Azure Storage as a Data Lake and how to manage files in Azure Storage using tools such as Azure Storage Explorer.
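Files in an ADLS Gen2 Data Lake are usually addressed with an `abfss://` URI that names the container, the storage account, and the path. The snippet below is a minimal sketch of that addressing scheme. The helper name `adls_uri`, the account name `examplelake`, the container `raw`, and the folder layout are all hypothetical and are not part of the course material.

```python
def adls_uri(account: str, container: str, path: str = "") -> str:
    """Build an abfss:// URI for a path inside an ADLS Gen2 container.

    The account, container, and path values are supplied by the caller;
    the ones used below are made up for illustration.
    """
    clean_path = path.lstrip("/")  # avoid a double slash after the host name
    return f"abfss://{container}@{account}.dfs.core.windows.net/{clean_path}"


# Hypothetical example: a raw orders file in a "raw" container.
raw_orders = adls_uri("examplelake", "raw", "/sales/orders/2024/01/orders.csv")
print(raw_orders)
```

The same URI form is what tools such as Spark expect when they read from or write to the Data Lake.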
ADF is used for both ETL and Orchestration. First, you will learn how to perform ETL using ADF Data Flow. During this process, you will work with files in an Azure Storage account, and you will also learn how to set up Linked Services and Datasets in ADF.
Once your ADF Data Flow is ready, you will build an Orchestration Pipeline using ADF Pipeline. During this process, you will learn about parameterization and how to manage baseline loads effectively.
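To make parameterization more concrete, here is a rough sketch of what a parameterized pipeline definition can look like, written as a Python dictionary that mirrors the pipeline JSON. It assumes a pipeline parameter named `loadType` that switches between a baseline (full) load and an incremental load. The pipeline name, parameter name, and activity name are hypothetical, and the branch bodies are left empty on purpose.

```python
import json

# Hypothetical pipeline definition; all names are illustrative only.
pipeline = {
    "name": "pl_orders_load",
    "properties": {
        # Pipeline parameters are supplied at trigger time or by a caller.
        "parameters": {
            "loadType": {"type": "String", "defaultValue": "incremental"}
        },
        "activities": [
            {
                "name": "CheckLoadType",
                "type": "IfCondition",
                "typeProperties": {
                    # ADF expression that reads the pipeline parameter.
                    "expression": {
                        "value": "@equals(pipeline().parameters.loadType, 'baseline')",
                        "type": "Expression",
                    },
                    # Placeholders: the baseline and incremental branches
                    # would hold the actual load activities.
                    "ifTrueActivities": [],
                    "ifFalseActivities": [],
                },
            }
        ],
    },
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

Keeping the load type as a parameter means the same pipeline can be reused for the first full load and for later incremental runs, instead of maintaining two copies.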
To optimize performance while using ADF Pipeline, you will learn key techniques such as controlling the number of partitions, using custom integration runtimes (IR), and more.
Azure offers RDBMS services for Postgres, SQL Server, and other databases. By setting up Azure SQL, you will understand how to create tables and run queries against them. To copy data between various sources and targets, ADF provides the Copy activity (ADF Data Copy). Once your database tables are set up, you can use it to load data into them.
Additionally, Azure provides Synapse Analytics for Data Warehousing. You will get an overview of serverless and dedicated pools and set up a Dedicated Pool for ETL using ADF. After setting up Azure SQL and Azure Synapse, you will build an ETL Pipeline using ADF Data Flow and orchestrate it with ADF Pipeline.
Azure Databricks is the Big Data Processing service that uses the Spark Engine. You will learn how to set up Azure Databricks, integrate it with ADLS, and manage secrets.
Using Azure Databricks, you will get an overview of Spark SQL and the PySpark DataFrame APIs. You will also build an ELT Pipeline using Databricks Jobs and Workflows, where tasks are defined using PySpark and Spark SQL.
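A Databricks job is made up of tasks, and each task can declare which other tasks it depends on. The sketch below shows that idea with a job definition shaped like a multi-task job specification, plus a small helper that works out a valid run order from the dependencies. The job name, task keys, and notebook paths are hypothetical, and `run_order` is an illustrative helper written for this example, not a Databricks API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job definition: three notebook tasks chained by depends_on.
job = {
    "name": "orders_elt",
    "tasks": [
        {
            "task_key": "ingest_raw",
            "notebook_task": {"notebook_path": "/Pipelines/ingest_raw"},
        },
        {
            "task_key": "clean_orders",
            "depends_on": [{"task_key": "ingest_raw"}],
            "notebook_task": {"notebook_path": "/Pipelines/clean_orders"},
        },
        {
            "task_key": "daily_revenue",
            "depends_on": [{"task_key": "clean_orders"}],
            "notebook_task": {"notebook_path": "/Pipelines/daily_revenue"},
        },
    ],
}


def run_order(job_spec: dict) -> list:
    """Return task keys in an order that respects every depends_on entry."""
    graph = {
        task["task_key"]: {dep["task_key"] for dep in task.get("depends_on", [])}
        for task in job_spec["tasks"]
    }
    # TopologicalSorter takes a mapping of node -> predecessors.
    return list(TopologicalSorter(graph).static_order())


print(run_order(job))
```

In a real workflow the scheduler resolves this ordering for you. The helper only illustrates how `depends_on` turns a flat list of tasks into an ELT sequence.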
Finally, you will understand how to build ADF Pipelines to orchestrate Databricks Notebooks. By following these steps, you will be able to effectively manage your data using Azure’s services and tools.
At Sriman IT, we offer expert-led Azure Data Factory course training with the following key features:
- 100% job-oriented training
- Industry-expert faculty
- Free demo class available
- Certification guidance
Sriman IT’s Azure Data Factory trainers have over 10 years of experience and have worked with major MNCs in Bangalore. Our trainers assess the proficiency level of beginners and design course objectives accordingly. Our institute has received several accolades, including Best Online Training (National/Regional) and Best Corporate Training Program (National/Regional). The instructors monitor the progress of learners and provide guidance on areas where they need improvement. They possess excellent organizational skills and can dependably manage multiple schedules. Throughout the course, instructors support learners in gaining a deep understanding of the key specialist subject areas. Our teachers are institutional leaders with a minimum of 10 years of experience who are committed to continuous learning and