
Basics to Advanced: Azure Synapse Analytics Hands-On Project

Course Outline

Build a complete project using only Azure Synapse Analytics, focused on PySpark and including Delta Lake and Spark optimizations.


1. Introduction
  • 1. Introduction
  • 2. Project Architecture
  • 3.1 Synapse Project Deck.pdf
  • 3. Course Slides.html

2. Origin of Azure Synapse Analytics
  • 1. Section Introduction
  • 2. Need for a separate analytical system
  • 3. OLAP vs OLTP
  • 4. A typical Data Warehouse
  • 5. Datalake Introduction
  • 6. The modern Data Warehouse and its problems
  • 7. The solution - Azure Synapse Analytics and its Components
  • 8. Azure Synapse Analytics - A single-stop solution
  • 9. Section Summary

3. Environment Setup
  • 1. Section Introduction
  • 2. Creating a resource group in Azure
  • 3. Create Azure Synapse Analytics Service
  • 4. Exploring Azure Synapse Analytics
  • 5. Understanding the dataset

4. Serverless SQL Pool
  • 1. Section Introduction
  • 2. Serverless SQL Pool - Introduction
  • 3. Serverless SQL Pool - Architecture
  • 4. Serverless SQL Pool - Benefits and Pricing
  • 5.1 Unemployment.csv
  • 5.2 unemployment.zip
  • 5. Uploading files into Azure Datalake Storage
  • 6.1 1 data exploration.zip
  • 6.2 Openrowset.html
  • 6. Initial Data Exploration
  • 7. How to import SQL scripts or ipynb notebooks to Azure Synapse
  • 8.1 2 fixing collation warning.zip
  • 8. Fixing the Collation warning
  • 9.1 3 creating external datasource.zip
  • 9. Creating External datasource
  • 10.1 4 creating database scoped credential sas.zip
  • 10. Creating database scoped credential Using SAS
  • 11.1 5 creating database scoped credential mi.zip
  • 11. Creating database scoped credential using Managed Identity (MI)
  • 12. Deleting existing data sources for cleanup
  • 13. Creating an external file format - Demo
  • 14.1 6 create external file format.zip
  • 14. Creating an External File Format - Practical
  • 15. Creating External DataSource for Refined container
  • 16.1 7 creating external table.zip
  • 16. Creating an External Table
  • 17. End of section
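
The exploration lectures in this section query raw files in the data lake straight from the serverless SQL pool with OPENROWSET. A minimal sketch of that pattern, driven from Python over pyodbc against the serverless endpoint; the workspace name, login, and file URL below are placeholders, not the course's actual values:

    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"  # serverless endpoint
        "DATABASE=master;UID=<sql-user>;PWD=<password>"
    )
    query = """
    SELECT TOP 10 *
    FROM OPENROWSET(
        BULK 'https://<storageaccount>.dfs.core.windows.net/raw/Unemployment.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',   -- required for HEADER_ROW
        HEADER_ROW = TRUE
    ) AS rows;
    """
    for row in conn.cursor().execute(query):
        print(row)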

5. History and Data processing before Spark
  • 1. Section Introduction
  • 2. Big Data Approach
  • 3. Understanding Hadoop YARN - Cluster Manager
  • 4. Understanding Hadoop - HDFS
  • 5. Understanding Hadoop - MapReduce Distributed Computing

6. Emergence of Spark
  • 1. Section Introduction
  • 2. Drawbacks of MapReduce Framework
  • 3. Emergence of Spark

7. Spark Core Concepts
  • 1. Section Introduction
  • 2. Spark EcoSystem
  • 3. Difference between Hadoop & Spark
  • 4. Spark Architecture
  • 5. Creating a Spark Pool & its benefits
  • 6. RDD Overview
  • 7. Lambda, Map and Filter Functions - Overview
  • 8.1 10 understanding rdd in practical.zip
  • 8. Understanding RDD in practical
  • 9. RDD - Lazy loading - Transformations and Actions
  • 10. What is RDD Lineage
  • 11. RDD - Word count program - Demo
  • 12.1 14 word count pyspark program practical.zip
  • 12.2 tonystark.txt
  • 12. RDD - Word count - PySpark Program - Practical
  • 13. Optimization - ReduceByKey vs GroupByKey Explanation
  • 14. RDD - Understanding Jobs in Spark - Practical
  • 15. RDD - Understanding Narrow and Wide Transformations
  • 16. RDD - Understanding Stages - Practical
  • 17.1 18 rdd understanding tasks practical.zip
  • 17. RDD - Understanding Tasks - Practical
  • 18. Understanding DAG, RDD Lineage and their Differences
  • 19. Spark Higher level APIs Intro
  • 20.1 2023-01-15 213417.413947.csv
  • 20.2 2023-01-15 213417.413947.zip
  • 20.4 dataframe practical.zip
  • 20. Synapse Notebook - Creating dataframes practical
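
The RDD lectures above build a word-count pipeline and then inspect its jobs, stages, and tasks. A minimal sketch of that pipeline, assuming the spark session that Synapse notebooks provide; the path to the attached tonystark.txt is a placeholder:

    # flatMap -> map -> reduceByKey is the word-count lineage; reduceByKey
    # pre-aggregates within each partition before the shuffle, which is why
    # lecture 13 prefers it over groupByKey.
    rdd = spark.sparkContext.textFile("/mnt/raw/tonystark.txt")
    counts = (
        rdd.flatMap(lambda line: line.split())   # one record per word
           .map(lambda w: (w.lower(), 1))        # pair RDD: (word, 1)
           .reduceByKey(lambda a, b: a + b)      # partial aggregation, then shuffle
    )
    print(counts.take(10))                       # an action; nothing runs until here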

8. PySpark Transformation 1 - Select and Filter functions
  • 1. Introduction for PySpark Transformations
  • 2.1 1 walkthough on notebook.zip
  • 2. Walkthrough on Notebook and Markdown cells
  • 3.1 Databricks login.html
  • 3.2 Databricks Signup.html
  • 3. Using Free Databricks Community Edition to practise and Save Costs
  • 4.1 2 display and show functions.zip
  • 4. Display and show Functions
  • 5. Stop Spark Session when not in use
  • 6.1 3 select and selectexpr.zip
  • 6. Select and SelectExpr
  • 7.1 4 filter function.zip
  • 7. Filter Function
  • 8. Organizing notebooks into a folder
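
A minimal sketch of the select, selectExpr, and filter calls this section walks through; df stands in for the unemployment dataframe from earlier sections, and the column names are assumptions:

    from pyspark.sql import functions as F

    selected = df.select("State", "Year", F.col("Rate").alias("UnemploymentRate"))
    expr_sel = df.selectExpr("State", "Rate * 100 AS RatePct")  # SQL-style expressions
    filtered = df.filter((F.col("Year") >= 2015) & (F.col("Rate") > 5.0))
    filtered.show(5)   # show() prints plain text; display() renders a grid in Synapse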

9. PySpark Transformation 2 - Handling Nulls, Duplicates and Aggregation
  • 1.1 1 understanding fillna and nadotfill.zip
  • 1. Understanding fillna and na.fill
  • 2.1 2 handling duplicates and dropna.zip
  • 2. Identifying duplicates using Aggregations
  • 3.1 2 handling duplicates and dropna.zip
  • 3. Handling Duplicates using dropna
  • 4. Organizing notebooks into a folder
  • 5. Transformations summary of this section
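
A minimal sketch of this section's null and duplicate handling, on the same illustrative df and columns as above:

    from pyspark.sql import functions as F

    # fillna is an alias of na.fill; both accept a value or a per-column dict.
    no_nulls = df.na.fill({"Rate": 0.0, "State": "Unknown"})

    # Surface duplicates with an aggregation ...
    (df.groupBy("State", "Year")
       .count()
       .filter(F.col("count") > 1)
       .show())

    # ... then drop exact duplicates, and rows that still contain nulls.
    cleaned = df.dropDuplicates(["State", "Year"]).dropna(how="any")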

10. PySpark Transformation 3 - Data Transformation and Manipulation
  • 1.1 3 data transformation and manipulation.zip
  • 1. withColumn to Create and Update columns
  • 2.1 3 data transformation and manipulation.zip
  • 2. Transforming and updating columns with withColumnRenamed
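
A minimal sketch of withColumn and withColumnRenamed on the same illustrative columns:

    from pyspark.sql import functions as F

    df2 = (df
           .withColumn("RatePct", F.col("Rate") * 100)       # create a new column
           .withColumn("Rate", F.round("Rate", 2))           # update an existing one
           .withColumnRenamed("Rate", "UnemploymentRate"))   # rename it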

11. PySpark 4 - Synapse Spark - MSSparkUtils
  • 1. What are MSSpark Utilities
  • 2.1 1 mssparkutils env.zip
  • 2. MSSpark Utils - Env utils
  • 3. What is a mount point
  • 4.1 2 msspark utils fs mount.zip
  • 4. Creating and accessing mount point in Notebook
  • 5.1 3 msspark utils fs utils.zip
  • 5. All File System Utils
  • 6.1 4 a notebook parent.zip
  • 6. Notebook Utils - Exit command
  • 7.1 Synapse Quotas.html
  • 7. Creating another spark pool
  • 8.1 To Submit ticket for quota increase.html
  • 8. Procedure to increase vCores request (optional)
  • 9.1 4 a notebook child.zip
  • 9.2 4 a notebook parent.zip
  • 9. Calling notebook from another notebook
  • 10.1 4 a notebook parent para.zip
  • 10. Calling notebook from another using runtime parameters
  • 11.1 5 magic commands.zip
  • 11. Magic commands
  • 12.1 FAQ.html
  • 12. Attaching two notebooks to a single spark pool
  • 13.1 6 1 accessing mount configuration.zip
  • 13.2 6 mount configuration.zip
  • 13. Accessing Mount points from another notebook
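
A minimal sketch of the MSSparkUtils calls this section covers, as exposed inside Synapse notebooks; the storage account, linked service, and child notebook names are placeholders:

    from notebookutils import mssparkutils  # built into Synapse Spark pools

    print(mssparkutils.env.getWorkspaceName())   # env utils

    # Mount an ADLS container via a linked service, then address it through the
    # session-scoped synfs scheme.
    mssparkutils.fs.mount(
        "abfss://raw@<storageaccount>.dfs.core.windows.net",
        "/raw",
        {"linkedService": "<linked-service-name>"},
    )
    job_id = mssparkutils.env.getJobId()
    for f in mssparkutils.fs.ls(f"synfs:/{job_id}/raw"):
        print(f.name)

    # Run a child notebook with runtime parameters; whatever the child passes to
    # mssparkutils.notebook.exit() comes back as the return value here.
    result = mssparkutils.notebook.run("<child-notebook>", 90, {"param1": "value"})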

12. PySpark 5 - Synapse - Spark SQL
  • 1.1 1 accessing data using temporary views practical.zip
  • 1. Accessing data using Temporary Views - Practical
  • 2. Lake Database - Overview
  • 3.1 2 creating database in lake database.zip
  • 3. Understanding and creating database in Lake Database
  • 4.1 2 creating database in lake database.zip
  • 4. Using Spark SQL in notebook
  • 5.1 3 managed vs external tables.zip
  • 5. Managed vs External tables in Spark
  • 6. Metadata sharing between Spark pool and Serverless SQL Pool
  • 7. Deleting unwanted folders
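
A minimal sketch of the temp-view and Lake Database patterns above; database, table, and path names are illustrative:

    # Temp views are session-scoped; Lake Database tables persist in the shared
    # metastore, where the serverless SQL pool can also see them (lecture 6).
    df.createOrReplaceTempView("unemployment_vw")
    spark.sql(
        "SELECT State, AVG(Rate) AS avg_rate FROM unemployment_vw GROUP BY State"
    ).show()

    spark.sql("CREATE DATABASE IF NOT EXISTS lakedb")
    # Managed table: Spark owns both the metadata and the data files.
    df.write.mode("overwrite").saveAsTable("lakedb.unemployment_managed")
    # External table: Spark owns only the metadata; dropping it keeps the files.
    (df.write.mode("overwrite")
       .option("path", "abfss://refined@<storageaccount>.dfs.core.windows.net/unemployment")
       .saveAsTable("lakedb.unemployment_external"))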

13. PySpark Transformation 6 - Join Transformations
  • 1.1 Education and Expected Salary ranges.csv
  • 1.2 Education Details.csv
  • 1.3 Salary Details.csv
  • 1. Uploading required files for Joins
  • 2.1 1 understanding joins and union.zip
  • 2. Python notebooks up to Union.html
  • 3. Inner join
  • 4. Left Join
  • 5. Right Join
  • 6. Full outer join
  • 7. Left Semi Join
  • 8. Left anti and Cross Join
  • 9. Union Operation
  • 10.1 2 performing join transformation.zip
  • 10. Performing Join Transformation on Project Dataset
  • 11. Summary of Transformations performed
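
A minimal sketch of the join types above, using the course's education and salary files; the paths and the shared key "Id" are assumptions:

    edu = spark.read.csv("/mnt/raw/Education Details.csv", header=True, inferSchema=True)
    sal = spark.read.csv("/mnt/raw/Salary Details.csv", header=True, inferSchema=True)

    inner = edu.join(sal, on="Id", how="inner")
    left  = edu.join(sal, on="Id", how="left")
    right = edu.join(sal, on="Id", how="right")
    full  = edu.join(sal, on="Id", how="outer")
    semi  = edu.join(sal, on="Id", how="left_semi")   # matching edu rows, edu columns only
    anti  = edu.join(sal, on="Id", how="left_anti")   # edu rows with no match
    cross = edu.crossJoin(sal)
    ids   = edu.select("Id").union(sal.select("Id"))  # union needs matching schemas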

14. PySpark Transformation 7 - String Manipulation and Sorting
  • 1. Replace function to change spaces
  • 2.1 1 string manipulation and sorting.zip
  • 2. PySpark Notebook for this section.html
  • 3. Split and concat functions
  • 4. Order by and sort
  • 5. Section Summary
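
A minimal sketch of the string and sorting functions in this section, on the same illustrative columns:

    from pyspark.sql import functions as F

    df2 = (df
           .withColumn("State", F.regexp_replace("State", " ", "_"))  # replace spaces
           .withColumn("parts", F.split("State", "_"))                # array of tokens
           .withColumn("label", F.concat_ws("-", "State", "Year")))   # join columns
    df2.orderBy(F.col("Rate").desc()).show(5)   # sort() is an alias of orderBy()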

15. PySpark Transformation 8 - Window Functions
  • 1. Row number function
  • 2.1 1 window functions.zip
  • 2. PySpark Notebook used in this section.html
  • 3. Rank Function
  • 4. Dense Rank function
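
A minimal sketch of the three window functions above; the partition and order columns are illustrative:

    from pyspark.sql import functions as F
    from pyspark.sql.window import Window

    w = Window.partitionBy("State").orderBy(F.col("Rate").desc())
    ranked = (df
              .withColumn("row_num",    F.row_number().over(w))   # unique 1..n
              .withColumn("rank",       F.rank().over(w))         # gaps after ties
              .withColumn("dense_rank", F.dense_rank().over(w)))  # no gaps after ties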

16. PySpark Transformation 9 - Conversions and Pivoting
  • 1. Conversion using cast function
  • 2.1 1 cast and pivoting.zip
  • 2. PySpark Notebook needed for casting and pivoting lectures.html
  • 3. Pivot function
  • 4. Unpivot using stack function
  • 5.1 2 to date+function.zip
  • 5.2 Databricks - Datetime Patterns.html
  • 5.3 Microsoft Docs - Date time patterns.html
  • 5.4 Microsoft Docs - Datetime.html
  • 5. Using to_date to convert a date column
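
A minimal sketch of casting, pivoting, unpivoting with stack, and to_date; the year columns and date format are assumptions for illustration:

    from pyspark.sql import functions as F

    typed = df.withColumn("Rate", F.col("Rate").cast("double"))

    pivoted = (typed.groupBy("State")
                    .pivot("Year")             # one column per distinct year
                    .agg(F.avg("Rate")))

    # Unpivot with stack: the leading 2 is the number of (label, value) pairs.
    unpivoted = pivoted.select(
        "State",
        F.expr("stack(2, '2015', `2015`, '2016', `2016`) AS (Year, Rate)"),
    )

    # to_date parses strings using the datetime patterns linked above.
    dated = df.withColumn("Date", F.to_date("Date", "dd-MM-yyyy"))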

17. PySpark Transformation 10 - Schema definition and Management
  • 1.1 1 schema definition and management.zip
  • 1. PySpark Notebook used in this lecture.html
  • 2. StructType and StructField - Demo
  • 3. Implementing explicit schema with StructType and StructField
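
A minimal sketch of an explicit schema in place of inferSchema; the field names mirror the illustrative columns used above:

    from pyspark.sql.types import (StructType, StructField, StringType,
                                   IntegerType, DoubleType)

    schema = StructType([
        StructField("State", StringType(),  True),
        StructField("Year",  IntegerType(), True),
        StructField("Rate",  DoubleType(),  True),   # third argument: nullable
    ])
    df = spark.read.csv("/mnt/raw/Unemployment.csv", header=True, schema=schema)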

18. PySpark Transformation 11 - UDFs
  • 1. User Defined Functions - Demo
  • 2.1 1 udfs.zip
  • 2. Implementing UDFs in Notebook
  • 3.1 1 writing data to processed container.zip
  • 3. Writing transformed data to Processed container
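
A minimal sketch of a UDF and of the final write to the Processed container; the banding logic and output path are illustrative:

    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    # A UDF runs row by row in Python, so prefer built-ins when they exist.
    @F.udf(returnType=StringType())
    def rate_band(rate):
        return "high" if rate is not None and rate > 8.0 else "normal"

    out = df.withColumn("RateBand", rate_band("Rate"))
    (out.write.mode("overwrite")
        .parquet("abfss://processed@<storageaccount>.dfs.core.windows.net/unemployment"))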

19. Dedicated SQL Pool
  • 1. Dedicated SQL pool - Demo
  • 2. Dedicated SQL Pool Architecture
  • 3. How distribution takes place based on DWUs
  • 4. Factors to consider when choosing dedicated SQL pool
  • 5. Creating Dedicated SQL pool in Synapse
  • 6. Ways to copy data into Dedicated SQL Pool
  • 7.1 1 copy command to get data into dedicated sql pool.zip
  • 7. Copy command to copy to dedicated SQL pool
  • 8. Clustered Columnstore Index (optional)
  • 9. Types of Distributions or Sharding patterns
  • 10. Using Pipeline to Copy to dedicated SQL Pool
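
A minimal sketch of the COPY command from lecture 7, issued over pyodbc against the dedicated pool's endpoint; the endpoint, pool, login, and storage URL are placeholders:

    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=<workspace>.sql.azuresynapse.net;"   # dedicated pool endpoint
        "DATABASE=<dedicated-pool>;UID=<sql-user>;PWD=<password>",
        autocommit=True,
    )
    conn.cursor().execute("""
        COPY INTO dbo.Unemployment
        FROM 'https://<storageaccount>.blob.core.windows.net/processed/unemployment/*.parquet'
        WITH (FILE_TYPE = 'PARQUET', CREDENTIAL = (IDENTITY = 'Managed Identity'))
    """)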

20. Reporting data to Power BI
  • 1. Section Introduction
  • 2. Installing Power BI Desktop
  • 3. Creating report from Power BI Desktop
  • 4. Creating new user in Azure AD for creating workspace (if using personal account)
  • 5. Creating a shared workspace in Power BI
  • 6. Publishing report to Shared Workspace
  • 7. Accessing Power BI from Azure Synapse Analytics
  • 8.1 synapse power bi report.zip
  • 8. Download Power BI .pbix file from here.html
  • 9. Creating Dataset and report from Synapse Analytics
  • 10. Concluding the Power BI Section
  • 11. Summary and end of project implementation

21. Spark - Optimisation Techniques
  • 1. Optimisation Section Intro
  • 2.1 cache.csv
  • 2.2 partition.zip
  • 2.3 Unemployment collect.csv
  • 2.4 Unemployment inferschema.csv
  • 2. Uploading required files for Optimisation
  • 3. Spark Optimisation levels
  • 4.1 1 optimization avoid collect.zip
  • 4. Avoid using Collect function
  • 5. Moving the notebook into a particular folder
  • 6.1 2 avoid infer schema.zip
  • 6. Avoid InferSchema
  • 7. Use Cache Persist 1 - Understanding Serialization and DeSerialization
  • 8. Use Cache Persist 2 - How cache or persist will work - Demo
  • 9.1 3 cache.zip
  • 9. Use Cache Persist 3 - Understanding cache practically
  • 10. Use Cache Persist 4 - Persist - What is persist and different storage levels
  • 11.1 4 persist.zip
  • 11.2 storage level notes.zip
  • 11. Use Cache Persist - Notebook for persist with all storage levels.html
  • 12. Use Cache Persist 5 - Persist - MEMORY ONLY
  • 13. Use Cache Persist 6 - Persist - MEMORY AND DISK
  • 14. Use Cache Persist 7 - Persist - MEMORY ONLY SER (Scala Only)
  • 15. Use Cache Persist 8 - Persist - MEMORY AND DISK SER ( Scala Only)
  • 16. Use Cache Persist 9 - Persist - DISK ONLY
  • 17. Use Cache Persist 10 - Persist - OFF HEAP (Scala Only)
  • 18. Use Cache Persist 11 - Persist - MEMORY ONLY 2 (PySpark only)
  • 19. Use Partitioning 1 - Understanding partitioning - Demo
  • 20.1 4 paritioning.zip
  • 20. Use Partitioning 2 - Understand partitioning - Practical
  • 21. Repartition and coalesce 1 - Understanding repartition and coalesce - Demo
  • 22. Repartition and coalesce 2 - Understanding repartition and coalesce - Practical
  • 23. Broadcast variables 1 - Understanding broadcast variables - Demo
  • 24.1 6 broadcast variables.zip
  • 24. Broadcast variables 2 - Implementing broadcast variables in notebook
  • 25. Use Kryo Serializer
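
A minimal sketch tying the optimisation techniques above together; df and the lookup file are illustrative:

    from pyspark import StorageLevel
    from pyspark.sql import functions as F

    df.cache()                          # DataFrames default to MEMORY_AND_DISK
    df.count()                          # an action materializes the cache
    df.unpersist()
    df.persist(StorageLevel.DISK_ONLY)  # or any of the levels from lectures 10-18

    df = df.repartition(8, "State")     # full shuffle; can raise or lower the count
    df = df.coalesce(4)                 # narrow; can only lower it

    # Broadcast the small side of a join so the large side is never shuffled.
    small = spark.read.csv("/mnt/raw/lookup.csv", header=True)
    joined = df.join(F.broadcast(small), "State")

    # Kryo (lecture 25) is session-level config, not per-DataFrame:
    # spark.serializer = org.apache.spark.serializer.KryoSerializer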

22. Delta Lake
  • 1. Section Introduction
  • 2. Drawbacks of ADLS
  • 3. What is Delta lake
  • 4. Lakehouse Architecture
  • 5.1 SchemaManagementDelta.csv
  • 5. Uploading required file for Delta lake
  • 6.1 1 problems in data lake and creating delta lake.zip
  • 6. Problems with Azure Datalake - Practical
  • 7. Creating a Delta lake
  • 8. Understanding Delta format
  • 9.1 2 understanding transaction log file.zip
  • 9. Contents of Transaction Log or Delta log file - Practical
  • 10. Contents of a transaction log demo
  • 11.1 3 creating delta tables using sql by path.zip
  • 11. Creating delta table by Path using SQL
  • 12.1 4 creating delta table in metastore pyspark and sql.zip
  • 12. Creating delta table in Metastore using Pyspark and SQL
  • 13.1 lesscols.zip
  • 13.2 SchemaDifferDataType.csv
  • 13.3 schemaextracolumn1.zip
  • 13. Schema Enforcement - Files required for Understanding Schema Enforcement
  • 14. What is schema enforcement - Demo
  • 15.1 4 creating delta table in metastore pyspark and sql.zip
  • 15. Schema Enforcement - Practical
  • 16.1 4 creating delta table in metastore pyspark and sql.zip
  • 16. Schema Evolution - Practical
  • 17.1 6 versioning and time travel.zip
  • 17. Versioning and Time Travel
  • 18.1 7 vacuum command.zip
  • 18. Vacuum command
  • 19.1 8 convert to delta lake and checkpoints.zip
  • 19. Convert to Delta command
  • 20.1 8 convert to delta lake and checkpoints.zip
  • 20. Checkpoints in delta log
  • 21. Optimize command - Demo
  • 22.1 9 optimize command.zip
  • 22. Optimize command - Practical
  • 23.1 10 - upsert using merge command.zip
  • 23. Applying UPSERT using MERGE Command
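
A minimal sketch of the Delta Lake operations above; the path is a placeholder and updates is an assumed dataframe of changed rows:

    from delta.tables import DeltaTable

    path = "abfss://delta@<storageaccount>.dfs.core.windows.net/unemployment"
    df.write.format("delta").mode("overwrite").save(path)   # creates the _delta_log

    v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)  # time travel

    tbl = DeltaTable.forPath(spark, path)
    (tbl.alias("t")                                          # UPSERT via MERGE
        .merge(updates.alias("s"), "t.Id = s.Id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

    tbl.vacuum(168)                        # drop stale files older than 168 hours
    spark.sql(f"OPTIMIZE delta.`{path}`")  # compact small files (recent runtimes)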

23. Conclusion
  • 1. Course Conclusion
  • 2. Bonus Lecture.html
Price: 189,000 Toman
ID: 20528
Size: 7964 MB
Duration: 1120 minutes
Release date: 15 Mehr 1402 (October 7, 2023)