Farin Company specialist website

Master Data Engineering using Azure Data Analytics

Course outline

Learn Azure Storage for Data Lake, ADF for ETL, Azure Synapse for Data Warehouse, Databricks for Big Data Pipelines, etc.


1. Introduction
  • 1. Introduction

2. Setup Environment for Data Engineering using Azure
  • 1. Setup VS Code on Windows
  • 2. Setup Python 3.9 on Windows
  • 3. Configure Environment Variable PATH for Python on Windows
  • 4. Integrate VS Code with Python on Windows

3. Getting Started with Azure for Data Engineering
  • 1. Sign up for Azure Portal
  • 2. Sign up for Azure Subscription
  • 3. Overview of Azure CLI and Azure Cloud Shell
  • 4. Setup Azure CLI on Windows or Mac or Linux
  • 5. Configure Azure CLI against Azure Portal Account
  • 6. Overview of Cost Management and Billing in Azure Portal
  • 7. Review Resources used by Azure Cloud Shell

4. Getting Started with Azure Resource Groups
  • 1. Create Azure Resource Group using Azure Portal
  • 2. Add Storage Account as Resource to Azure Resource Group
  • 3. Overview of Azure Resource Groups and Resources

5. Setup Data Sets for Data Engineering
  • 1. Download Data Sets for Data Engineering from Git Repository
  • 2. Create Container within Azure Storage Account
  • 3. Review Upload Feature of Azure Storage Account using Azure Portal
  • 4. Setup Azure Storage Explorer on Windows or Mac
  • 5. Upload Local Folder into Azure Storage Container using Storage Explorer
  • 6. Validate Data Sets using Azure Portal
  • 7. Create ADLS Storage Account in Azure
  • 8. Upgrade Azure Blob Storage to ADLS Gen 2

6. Getting Started with Azure Data Factory
  • 1. Introduction to Getting Started with Azure Data Factory
  • 2. Setup Azure Data Factory and Launch ADF Studio
  • 3. Overview of Azure Data Factory Studio
  • 4. Create ADF Linked Service to Storage Account
  • 5. Create ADF Dataset using ADF Studio
  • 6. Review ADF Dataset CSV Properties
  • 7. Create Azure Dataset for Sink using Parquet
  • 8. Understand the Schema of Data Set
  • 9. Create Data Flow Source using Azure Dataset
  • 10. Define Cache Sink to ADF Data Flow
  • 11. Create ADF Pipeline for File Format Converter
  • 12. Run and Review ADF Data Pipelines
  • 13. Update ADF Data Flow with ADLS Dataset as Sink
  • 14. Conclusion to Getting Started with Azure Data Factory
  • 15. Exercise - Simple ADF Data Flow and Pipeline for Order Items

7. ADF Data Flow for ETL Logic to Compute Daily Product Revenue
  • 1. Introduction to ADF Data Flow for ETL Logic to Compute Daily Product Revenue
  • 2. Create Data Flow to Compute Daily Product Revenue
  • 3. Filter Transformation in ADF Data Flow
  • 4. Create ADF Pipeline to Validate Data Flow
  • 5. Create ADF Integration Runtime to run ADF Pipelines
  • 6. Validate Custom ADF Integration Runtime using ADF Pipeline
  • 7. ADF Data Flow Filter Transformation using in
  • 8. ADF Data Flow Join Transformation between 2 Data Sets
  • 9. Validate ADF Data Flow Join Transformation using ADF Pipeline
  • 10. ADF Data Flow Aggregate Transformation to Compute Daily Product Revenue
  • 11. ADF Data Flow Sink to Save Results to Azure Storage using Parquet
  • 12. Run and Review ADF Pipeline with ETL Data Flow
  • 13. Access JSON Code of ADF Data Flow and Pipeline

8. Run ADF Pipelines Dynamically using Parameters
  • 1. Introduction to Running ADF Pipelines Dynamically using Parameters
  • 2. Create ADF Data Set using Parameter for Dynamic Path
  • 3. Define Parameter and Use in Filter Transformation of ADF Data Flow
  • 4. Create ADF Pipeline with Parameter
  • 5. Run ADF Pipeline with Parameters

9. Run Baseline ETL Loads using ADF Pipeline
  • 1. Overview of Common ADF Pipeline Activities
  • 2. Overview of ADF Pipeline ForEach
  • 3. Create ADF Pipeline for Baseline load using ForEach and Execute Pipeline
  • 4. Run ADF Pipeline for Baseline Load

10. Performance Tuning of ADF Data Flows and Pipelines
  • 1. Introduction to Performance Tuning of ADF Data Flows and Pipelines
  • 2. Create Integration Runtime with right Compute Size
  • 3. Troubleshoot Performance Bottleneck of Baseline ADF Pipeline
  • 4. Reduce Cluster Startup Time using Custom Integration Runtime
  • 5. Using Parallelism in ADF Pipeline ForEach Activity
  • 6. Troubleshoot Shuffling and Too Many Small Files Issue
  • 7. Reduce Shuffle Partitions in ADF Data Flow Aggregate Transformation
  • 8. Conclusion of Performance Tuning of ADF Pipelines and Data Flows

11. Getting Started with Azure SQL Database
  • 1. Setup Azure SQL Database Server
  • 2. Setup Database in Azure SQL Database Server
  • 3. Overview of SQL Server Databases in Azure
  • 4. Setup Azure Data Studio on Windows or Mac or Linux
  • 5. Connect to Azure SQL Database using Azure Data Studio

12. ADF Data Copy to Copy Data From Files to SQL Server Tables
  • 1. Create table in Azure SQL Database
  • 2. Create Linked Service and Dataset for Azure SQL Database Table
  • 3. Copy ADF Dataset into a folder
  • 4. Create ADF Pipeline with Data Copy to Copy CSV Data to SQL Table
  • 5. Define Mapping in ADF Data Copy
  • 6. Merge from CSV to SQL Table using ADF Pipeline Data Copy
  • 7. Exercise to Copy Data to SQL Table using ADF Data Copy

13. Getting Started with Azure Synapse Analytics
  • 1. Create Azure Synapse Analytics Workspace
  • 2. Getting Started with Azure Synapse Studio
  • 3. Overview of Azure Synapse Serverless SQL Pool
  • 4. Link Azure Storage Account with Azure Synapse Workspace
  • 5. Generate Azure Synapse Query using ADLS Files
  • 6. Run Queries against ADLS files using Azure Synapse Serverless Workspace
  • 7. Integrating Azure Synapse Workspace and Data in ADLS Files
  • 8. Specifying Schema for Azure Synapse Queries on ADLS Files
  • 9. Creating External Tables on ADLS Files using Azure Synapse Workspace
  • 10. Managing SQL Scripts in Azure Synapse Studio
  • 11. Create Dedicated SQL Pool in Azure Synapse Workspace
  • 12. Create Table and Copy Data into Azure Synapse Dedicated SQL Pool Database
  • 13. Overview of Development Tools in Azure Synapse Studio
  • 14. Copy Data into Azure Synapse Tables using Copy Data Tool
  • 15. Exercise to get started with Azure Synapse Analytics

14. Build ADF Data Flow using Azure SQL and Synapse Analytics
  • 1. Introduction to ETL Logic and Application Architecture
  • 2. Overview of ETL using ADF Data Flow
  • 3. Getting Started with Azure Data Factory for the ETL Logic
  • 4. Review Linked Service and Datasets for Azure SQL Database and Tables
  • 5. Create ADF Data Flow with Azure SQL Database Table as Source
  • 6. Review ADF Data Flow Source Options for Database Table
  • 7. Create ADF Data Flow Source using Azure SQL Query
  • 8. Run ADF Pipeline with Source using Azure SQL Query
  • 9. Define and Use Parameters in ADF Data Flow Source with Azure SQL Query
  • 10. Run ADF Pipeline with SQL Query Source using Parameters
  • 11. ADF Data Flow Join Orders and Order Items
  • 12. Run ADF Data Flow with Join between Orders and Order Items
  • 13. Troubleshoot Issue in ADF Data Flow Source Query
  • 14. Run and Validate ADF Data Flow with fix
  • 15. Add Aggregate to ADF Data Flow
  • 16. Run ADF Pipeline to Compute Daily Product Revenue
  • 17. Create Target Table in Azure Synapse Dedicated SQL Pool
  • 18. Create Linked Service and Dataset for Azure Synapse Table
  • 19. Update and Run ADF Data Flow Sink with Synapse Dataset
  • 20. Pause or Delete Azure Synapse Dedicated SQL Pool

15. Getting Started with Azure Databricks
  • 1. Create Azure Databricks Workspace using Premium Trial
  • 2. Launch Azure Databricks Workspace or Environment
  • 3. Getting Started with Databricks Clusters on Azure
  • 4. Increase Azure Quota for Databricks Clusters
  • 5. Getting Started with Databricks Notebook on Azure
  • 6. Overview of Azure Databricks Workspace Infrastructure
  • 7. Overview of Azure Databricks and other Azure Services
  • 8. Delete Azure Databricks Workspace
  • 9. Setup Databricks CLI on Windows
  • 10. Configure Databricks CLI and Validate
  • 11. Troubleshoot and Reconfigure Databricks CLI using Token

16. Integration of Azure Storage and Databricks
  • 1. Introduction to Integration of Azure Storage and Databricks
  • 2. Setup Databricks Personal Compute Cluster
  • 3. Setup Data Set in DBFS using Databricks CLI Commands
  • 4. Overview of fs magic in Databricks Notebooks
  • 5. Access Files in Azure Storage Account using Credentials Passthrough
  • 6. Access Files in Azure Storage Account using Access Key
  • 7. Create Spark SQL View using Data in Azure Storage Accounts
  • 8. Validate using Spark SQL and Exercise to Create Spark SQL View
  • 9. Integration of Azure Storage and PySpark Demo

17. Overview of Databricks Secrets
  • 1. Introduction to Overview of Databricks Secrets
  • 2. Managing Databricks Secrets using Databricks CLI Commands
  • 3. Create Databricks Secret for Azure Storage Account Key
  • 4. Access Secret Details in Databricks Applications using dbutils secrets APIs
  • 5. Authenticate Application with Azure Storage Account using Databricks Secrets
  • 6. Steps involved in using Databricks Secrets

18. Basic Transformations using Spark SQL
  • 1. Process Data in DBFS using Databricks Spark SQL
  • 2. Getting Started with Spark SQL Example using Databricks
  • 3. Create Temporary Views using Spark SQL
  • 4. Exercise to create temporary views using Spark SQL
  • 5. Spark SQL Query to compute Daily Product Revenue
  • 6. Save Query Result to DBFS using Spark SQL

19. Ranking using Spark SQL Windowing Functions
  • 1. Ranking using Spark SQL Windowing Functions
  • 2. Create Temporary View for ranking using Spark SQL Windowing Functions
  • 3. Compute Global Rank using Spark SQL Windowing Functions
  • 4. Compute Ranks Per Key using Spark SQL Windowing Functions
  • 5. Difference Between rank and dense_rank
  • 6. Filter on Ranks using Spark SQL Windowing Functions

20. Getting Started with PySpark Data Frame APIs
  • 1. Overview of PySpark Examples on Databricks
  • 2. Process Schema Details in JSON using PySpark
  • 3. Create DataFrame with Schema from JSON File using PySpark
  • 4. Transform Data using Spark APIs
  • 5. Get Schema Details for all Data Sets using PySpark
  • 6. Convert CSV to Parquet with Schema using PySpark

21. Databricks Jobs and Workflows
  • 1. Overview of Databricks Workflows
  • 2. Pass Arguments to Databricks Python Notebooks
  • 3. Pass Arguments to Databricks SQL Notebooks
  • 4. Create and Run First Databricks Job
  • 5. Run Databricks Jobs and Tasks with Parameters
  • 6. Create and Run Orchestrated Pipeline using Databricks Job
  • 7. Import ELT Data Pipeline Applications into Databricks Environment
  • 8. Spark SQL Application to Cleanup Database and Datasets
  • 9. Review File Format Converter PySpark Code
  • 10. Review Databricks SQL Notebooks for Tables and Final Results
  • 11. Validate Applications for ELT Pipeline using Databricks
  • 12. Build ELT Pipeline using Databricks Job in Workflows
  • 13. Run and Review Execution details of ELT Data Pipeline using Databricks Job

22. Build ELT Pipelines using Databricks Jobs and Workflows
  • 1. Overview of Databricks Workflows
  • 2. Pass Arguments to Databricks Python Notebooks
  • 3. Pass Arguments to Databricks SQL Notebooks
  • 4. Create and Run First Databricks Job
  • 5. Run Databricks Jobs and Tasks with Parameters
  • 6. Create and Run Orchestrated Pipeline using Databricks Job
  • 7. Import ELT Data Pipeline Applications into Databricks Environment
  • 8. Spark SQL Application to Cleanup Database and Datasets
  • 9. Review File Format Converter PySpark Code
  • 10. Review Databricks SQL Notebooks for Tables and Final Results
  • 11. Validate Applications for ELT Pipeline using Databricks
  • 12. Build ELT Pipeline using Databricks Job in Workflows
  • 13. Run and Review Execution details of ELT Data Pipeline using Databricks Job
  • 14. Cleanup Databricks Environment on GCP

23. Orchestrate Azure Databricks Applications using ADF Pipelines
  • 1. Getting Started with ADF to integrate with Databricks
  • 2. Create ADF Linked Service for Azure Databricks
  • 3. Trigger Azure Databricks Application from ADF Pipeline
  • 4. Develop Core Logic using Databricks Notebook for ADF Pipeline Integration
  • 5. Overview of Parameters using Databricks Notebooks
  • 6. Add Parameters to ADF Pipeline and Databricks Activity
  • 7. Run ADF Pipeline with Databricks Activity using Parameters

24. Build Data Pipelines using ADF Pipelines and Databricks
  • 1. Introduction to Data Pipelines using ADF Pipelines and Databricks
  • 2. Code Review to Compute Daily Product Revenue
  • 3. Overview of ADF Pipeline and Databricks Integration Features
  • 4. Different Options for ADF Pipelines and Databricks Notebook Integration
  • 5. Create Driver Databricks Notebook for ADF Pipeline
  • 6. Create ADF Linked Service for Databricks Job Cluster
  • 7. Create ADF Pipeline with Databricks Driver Notebook
  • 8. Run ADF Pipeline with Databricks Driver Notebook
  • 9. ADF Pipeline to Orchestrate Databricks Notebooks
  • 10. ADF Pipeline to Orchestrate Databricks Notebooks
  • 11. Orchestrate Databricks Applications using ADF Pipeline
  • 12. Import ELT Data Pipeline Applications into Databricks Environment

ID: 7466
Size: 5287 MB
Duration: 813 minutes
Release date: 13 Esfand 1401 (4 March 2023)