Farin Company specialist website

Basics to Advanced: Azure Synapse Analytics Hands-On Project

Course Outline

Build a complete project using only Azure Synapse Analytics, with a focus on PySpark, including Delta Lake and Spark optimizations


1. Introduction
  • 1. Introduction
  • 2. Project Architecture
  • 3.1 Synapse Project Deck.pdf
  • 3. Course Slides.html

2. Origin of Azure Synapse Analytics
  • 1. Section Introduction
  • 2. Need for a separate analytical system
  • 3. OLAP vs OLTP
  • 4. A typical Datawarehouse
  • 5. Datalake Introduction
  • 6. Modern datawarehouse and its problem
  • 7. The solution - Azure Synapse Analytics and its Components
  • 8. Azure Synapse Analytics - A Single stop solution
  • 9. Section Summary

3. Environment Setup
  • 1. Section Introduction
  • 2. Creating a resource group in Azure
  • 3. Create Azure Synapse Analytics Service
  • 4. Exploring Azure Synapse Analytics
  • 5. Understanding the dataset

4. Serverless SQL Pool
  • 1. Section Introduction
  • 2. Serverless SQL Pool - Introduction
  • 3. Serverless SQL Pool - Architecture
  • 4. Serverless SQL Pool - Benefits and Pricing
  • 5.1 Unemployment.csv
  • 5.2 unemployment.zip
  • 5. Uploading files into Azure Datalake Storage
  • 6.1 1 data exploration.zip
  • 6.2 Openrowset.html
  • 6. Initial Data Exploration
  • 7. How to import SQL scripts or ipynb notebooks to Azure Synapse
  • 8.1 2 fixing collation warning.zip
  • 8. Fixing the Collation warning
  • 9.1 3 creating external datasource.zip
  • 9. Creating External datasource
  • 10.1 4 creating database scoped credential sas.zip
  • 10. Creating database scoped credential Using SAS
  • 11.1 5 creating database scoped credential mi.zip
  • 11. Creating Database scoped cred using MI
  • 12. Deleting existing data sources for cleanup
  • 13. Creating an external file format - Demo
  • 14.1 6 create external file format.zip
  • 14. Creating an External File Format - Practical
  • 15. Creating External DataSource for Refined container
  • 16.1 7 creating external table.zip
  • 16. Creating an External Table
  • 17. End of section

5. History and Data processing before Spark
  • 1. Section Introduction
  • 2. Big Data Approach
  • 3. Understanding Hadoop YARN - Cluster Manager
  • 4. Understanding Hadoop - HDFS
  • 5. Understanding Hadoop - MapReduce Distributed Computing

6. Emergence of Spark
  • 1. Section Introduction
  • 2. Drawbacks of MapReduce Framework
  • 3. Emergence of Spark

7. Spark Core Concepts
  • 1. Section Introduction
  • 2. Spark EcoSystem
  • 3. Difference between Hadoop & Spark
  • 4. Spark Architecture
  • 5. Creating a Spark Pool & its benefits
  • 6. RDD Overview
  • 7. Functions Lambda, Map and Filter - Overview
  • 8.1 10 understanding rdd in practical.zip
  • 8. Understanding RDD in practical
  • 9. RDD - Lazy loading - Transformations and Actions
  • 10. What is RDD Lineage
  • 11. RDD - Word count program - Demo
  • 12.1 14 word count pyspark program practical.zip
  • 12.2 tonystark.txt
  • 12. RDD - Word count - PySpark Program - Practical
  • 13. Optimization - ReduceByKey vs GroupByKey Explanation
  • 14. RDD - Understanding about Jobs in spark Practical
  • 15. RDD - Understanding Narrow and Wide Transformations
  • 16. RDD - Understanding Stages - Practical
  • 17.1 18 rdd understanding tasks practical.zip
  • 17. RDD - Understanding Tasks - Practical
  • 18. Understand DAG, RDD Lineage and Differences
  • 19. Spark Higher level APIs Intro
  • 20.1 2023-01-15 213417.413947.csv
  • 20.2 2023-01-15 213417.413947.zip
  • 20.3 2023-01-15 213417.413947.zip
  • 20.4 dataframe practical.zip
  • 20. Synapse Notebook - Creating dataframes practical

8. PySpark Transformation 1 - Select and Filter functions
  • 1. Introduction for PySpark Transformations
  • 2.1 1 walkthough on notebook.zip
  • 2. Walkthrough on Notebook, Markdown cells
  • 3.1 Databricks login.html
  • 3.2 Databricks Signup.html
  • 3. Using Free Databricks Community Edition to practise and Save Costs
  • 4.1 2 display and show functions.zip
  • 4. Display and show Functions
  • 5. Stop Spark Session when not in use
  • 6.1 3 select and selectexpr.zip
  • 6. Select and SelectExpr
  • 7.1 4 filter function.zip
  • 7. Filter Function
  • 8. Organizing notebooks into a folder

9. PySpark Transformation 2 - Handling Nulls, Duplicates and aggregation
  • 1.1 1 understanding fillna and nadotfill.zip
  • 1. Understanding fillna and na.fill
  • 2.1 2 handling duplicates and dropna.zip
  • 2. Identifying duplicates using Aggregations
  • 3.1 2 handling duplicates and dropna.zip
  • 3. Handling Duplicates using dropna
  • 4. Organising notebooks into a folder
  • 5. Transformations summary of this section

10. PySpark Transformation 3 - Data Transformation and Manipulation
  • 1.1 3 data transformation and manipulation.zip
  • 1. withColumn to Create or Update columns
  • 2.1 3 data transformation and manipulation.zip
  • 2. Transforming and updating column withColumnRenamed

11. PySpark 4 - Synapse Spark - MSSparkUtils
  • 1. What is MSSpark Utilities
  • 2.1 1 mssparkutils env.zip
  • 2. MSSpark Utils - Env utils
  • 3. What is mount point
  • 4.1 2 msspark utils fs mount.zip
  • 4. Creating and accessing mount point in Notebook
  • 5.1 3 msspark utils fs utils.zip
  • 5. All File System Utils
  • 6.1 4 a notebook parent.zip
  • 6. Notebook Utils - Exit command
  • 7.1 Synapse Quotas.html
  • 7. Creating another spark pool
  • 8.1 To Submit ticket for quota increase.html
  • 8. Procedure to increase vCores request (optional)
  • 9.1 4 a notebook child.zip
  • 9.2 4 a notebook parent.zip
  • 9. Calling notebook from another notebook
  • 10.1 4 a notebook parent para.zip
  • 10. Calling notebook from another using runtime parameters
  • 11.1 5 magic commands.zip
  • 11. Magic commands
  • 12.1 FAQ.html
  • 12. Attaching two notebooks to a single spark pool
  • 13.1 6 1 accessing mount configuration.zip
  • 13.2 6 mount configuration.zip
  • 13. Accessing Mount points from another notebook

12. PySpark 5 - Synapse - Spark SQL
  • 1.1 1 accessing data using temporary views practical.zip
  • 1. Accessing data using Temporary Views - Practical
  • 2. Lake Database - Overview
  • 3.1 2 creating database in lake database.zip
  • 3. Understanding and creating database in Lake Database
  • 4.1 2 creating database in lake database.zip
  • 4. Using Spark SQL in notebook
  • 5.1 3 managed vs external tables.zip
  • 5. Managed vs External tables in Spark
  • 6. Metadata sharing between Spark pool and Serverless SQL Pool
  • 7. Deleting unwanted folders

13. PySpark Transformation 6 - Join Transformations
  • 1.1 Education and Expected Salary ranges.csv
  • 1.2 Education Details.csv
  • 1.3 Salary Details.csv
  • 1. Uploading required files for Joins
  • 2.1 1 understanding joins and union.zip
  • 2. Python notebooks till Union.html
  • 3. Inner join
  • 4. Left Join
  • 5. Right Join
  • 6. Full outer join
  • 7. Left Semi Join
  • 8. Left anti and Cross Join
  • 9. Union Operation
  • 10.1 2 performing join transformation.zip
  • 10. Performing Join Transformation on Project Dataset
  • 11. Summary of Transformations performed

14. PySpark Transformation 7 - String Manipulation and sorting
  • 1. Replace function to change spaces
  • 2.1 1 string manipulation and sorting.zip
  • 2. PySpark Notebook for this section.html
  • 3. Split and concat functions
  • 4. Order by and sort
  • 5. Section Summary

15. PySpark Transformation 8 - Window Functions
  • 1. Row number function
  • 2.1 1 window functions.zip
  • 2. PySpark Notebook used in this section.html
  • 3. Rank Function
  • 4. Dense Rank function

16. PySpark Transformation 9 - Conversions and Pivoting
  • 1. Conversion using cast function
  • 2.1 1 cast and pivoting.zip
  • 2. PySpark Notebook needed for casting and pivoting lectures.html
  • 3. Pivot function
  • 4. Unpivot using stack function
  • 5.1 2 to date+function.zip
  • 5.2 Databricks - Datetime Patterns.html
  • 5.3 Microsoft Docs - Date time patterns.html
  • 5.4 Microsoft Docs - Datetime.html
  • 5. Using to_date to convert a date column

17. PySpark Transformation 10 - Schema definition and Management
  • 1.1 1 schema definition and management.zip
  • 1. PySpark Notebook used in this lecture.html
  • 2. StructType and StructField - Demo
  • 3. Implementing explicit schema with StructType and StructField

18. PySpark Transformation 11 - UDFs
  • 1. User Defined Functions - Demo
  • 2.1 1 udfs.zip
  • 2. Implementing UDFs in Notebook
  • 3.1 1 writing data to processed container.zip
  • 3. Writing transformed data to Processed container

19. Dedicated SQL Pool
  • 1. Dedicated SQL pool - Demo
  • 2. Dedicated SQL Pool Architecture
  • 3. How distribution takes place based on DWU
  • 4. Factors to consider when choosing dedicated SQL pool
  • 5. Creating Dedicated SQL pool in Synapse
  • 6. Ways to copy data into Dedicated SQL Pool
  • 7.1 1 copy command to get data into dedicated sql pool.zip
  • 7. Copy command to copy to dedicated SQL pool
  • 8. Clustered Columnstore Index (optional)
  • 9. Types of Distributions or Sharding patterns
  • 10. Using Pipeline to Copy to dedicated SQL Pool

20. Reporting data to Power BI
  • 1. Section Introduction
  • 2. Installing Power BI Desktop
  • 3. Creating report from Power BI Desktop
  • 4. Creating new user in Azure AD for creating workspace (if using personal account)
  • 5. Creating a shared workspace in Power BI
  • 6. Publishing report to Shared Workspace
  • 7. Accessing Power BI from Azure Synapse Analytics
  • 8.1 synapse power bi report.zip
  • 8. Download Power BI .pbix file from here.html
  • 9. Creating Dataset and report from Synapse Analytics
  • 10. Concluding the Power BI Section
  • 11. Summary and end of project implementation

21. Spark - Optimisation Techniques
  • 1. Optimisation Section Intro
  • 2.1 cache.csv
  • 2.2 partition.zip
  • 2.3 Unemployment collect.csv
  • 2.4 Unemployment inferschema.csv
  • 2. Uploading required files for Optimisation
  • 3. Spark Optimisation levels
  • 4.1 1 optimization avoid collect.zip
  • 4. Avoid using Collect function
  • 5. Making notebook into particular folder
  • 6.1 2 avoid infer schema.zip
  • 6. Avoid InferSchema
  • 7. Use Cache Persist 1 - Understanding Serialization and DeSerialization
  • 8. Use Cache Persist 2 - How cache or persist will work - Demo
  • 9.1 3 cache.zip
  • 9. Use Cache Persist 3 - Understanding cache practically
  • 10. Use Cache Persist 4 - Persist - What is persist and different storage levels
  • 11.1 4 persist.zip
  • 11.2 storage level notes.zip
  • 11. Use Cache Persist - Notebook for persist with all storage levels.html
  • 12. Use Cache Persist 5 - Persist - MEMORY ONLY
  • 13. Use Cache Persist 6 - Persist - MEMORY AND DISK
  • 14. Use Cache Persist 7 - Persist - MEMORY ONLY SER (Scala Only)
  • 15. Use Cache Persist 8 - Persist - MEMORY AND DISK SER (Scala Only)
  • 16. Use Cache Persist 9 - Persist - DISK ONLY
  • 17. Use Cache Persist 10 - Persist - OFF HEAP (Scala Only)
  • 18. Use Cache Persist 11 - Persist - MEMORY ONLY 2 (PySpark only)
  • 19. Use Partitioning 1 - Understanding partitioning - Demo
  • 20.1 4 paritioning.zip
  • 20. Use Partitioning 2 - Understand partitioning - Practical
  • 21. Repartition and coalesce 1 - Understanding repartition and coalesce - Demo
  • 22. Repartition and coalesce 2 - Understanding repartition and coalesce - Practical
  • 23. Broadcast variables 1 - Understanding broadcast variables - Demo
  • 24.1 6 broadcast variables.zip
  • 24. Broadcast variables 2 - Implementing broadcast variables in notebook
  • 25. Use Kryo Serializer

22. Delta Lake
  • 1. Section Introduction
  • 2. Drawbacks of ADLS
  • 3. What is Delta lake
  • 4. Lakehouse Architecture
  • 5.1 SchemaManagementDelta.csv
  • 5. Uploading required file for Delta lake
  • 6.1 1 problems in data lake and creating delta lake.zip
  • 6. Problems with Azure Datalake - Practical
  • 7. Creating a Delta lake
  • 8. Understanding Delta format
  • 9.1 2 understanding transaction log file.zip
  • 9. Contents of Transaction Log or Delta log file - Practical
  • 10. Contents of a transaction log demo
  • 11.1 3 creating delta tables using sql by path.zip
  • 11. Creating delta table by Path using SQL
  • 12.1 4 creating delta table in metastore pyspark and sql.zip
  • 12. Creating delta table in Metastore using Pyspark and SQL
  • 13.1 lesscols.zip
  • 13.2 SchemaDifferDataType.csv
  • 13.3 schemaextracolumn1.zip
  • 13. Schema Enforcement - Files required for Understanding Schema Enforcement
  • 14. What is schema enforcement - Demo
  • 15.1 4 creating delta table in metastore pyspark and sql.zip
  • 15. Schema Enforcement - Practical
  • 16.1 4 creating delta table in metastore pyspark and sql.zip
  • 16. Schema Evolution - Practical
  • 17.1 6 versioning and time travel.zip
  • 17. Versioning and Time Travel
  • 18.1 7 vacuum command.zip
  • 18. Vacuum command
  • 19.1 8 convert to delta lake and checkpoints.zip
  • 19. Convert to Delta command
  • 20.1 8 convert to delta lake and checkpoints.zip
  • 20. Checkpoints in delta log
  • 21. Optimize command - Demo
  • 22.1 9 optimize command.zip
  • 22. Optimize command - Practical
  • 23.1 10 - upsert using merge command.zip
  • 23. Applying UPSERT using MERGE Command

23. Conclusion
  • 1. Course Conclusion
  • 2. Bonus Lecture.html

    Price: 139,000 Toman
    ID: 20528
    Size: 7964 MB
    Duration: 1120 minutes
    Publication date: 15 Mehr 1402 (7 October 2023)