Farin Company Specialized Website

Apache Spark 3 Fundamentals

Course Outline

Learn the fundamentals of Apache Spark 3: process data, set up the environment, work with RDDs and DataFrames, optimize applications, and build pipelines with Databricks and Azure Synapse. The course also introduces Spark's ecosystem.


1. Course Overview
  • 1. Course Trailer

2. Getting Started with Apache Spark
  • 1. Introduction and Course Outline
  • 2. Version Check
  • 3. Need for Apache Spark
  • 4. Understanding Spark Architecture and Ecosystem
  • 5. How Execution Happens in Spark
  • 6. Spark APIs: RDDs, DataFrames and Datasets
  • 7. Summary

3. Setting up Spark Environment
  • 01. Module Overview
  • 02. Understanding Spark Environments
  • 03. Installing Spark
  • 04. Monitoring Spark with Web UI
  • 05. Option 1 - Running Spark in Command Line
  • 06. Option 2 - Running Spark with Jupyter Notebooks
  • 07. Option 3 - Creating Project with PyCharm IDE
  • 08. Option 4 - Running Jobs with Spark Submit
  • 09. Setting Up Multi-Node Cluster
  • 10. Summary

4. Working with RDDs - Resilient Distributed Datasets
  • 1. Module Overview
  • 2. Understanding RDDs
  • 3. Creating RDDs
  • 4. Working with Pair RDDs
  • 5. Applying Operations on RDDs
  • 6. Using Narrow Transformations
  • 7. Wide Transformations and Data Shuffling
  • 8. Spark Application Concepts - Jobs, Stages and Tasks
  • 9. Summary

5. Cleaning and Transforming Data with DataFrames
  • 1. Module Overview
  • 2. Understanding DataFrames
  • 3. Creating DataFrames
  • 4. Applying Schemas
  • 5. Analyzing and Cleaning Data
  • 6. Applying Transformations
  • 7. Handling Corrupt Data
  • 8. Saving Processed Data to Files
  • 9. Summary

6. Working with Spark SQL, UDFs, and Common DataFrame Operations
  • 1. Module Overview
  • 2. Running SQL Queries on DataFrames
  • 3. Working with Spark Tables
  • 4. Working with User Defined Functions (UDFs)
  • 5. Performing Operations on Multiple Datasets
  • 6. Performing Window Operations
  • 7. Summary

7. Performing Optimizations in Spark
  • 01. Module Overview
  • 02. Working with Spark Partitions
  • 03. Changing DataFrame Partitions
  • 04. Memory Management
  • 05. Persisting Data
  • 06. Spark Join Strategies and Broadcast Joins
  • 07. Optimizing Shuffle Sort Join with Bucketing
  • 08. Dynamic Resource Allocation
  • 09. Resource Allocation Using Fair Scheduling
  • 10. Summary

8. Features in Apache Spark 3
  • 1. Introduction to Apache Spark 3
  • 2. Adaptive Query Execution - Dynamic Coalescing
  • 3. Adaptive Query Execution - Dynamic Join
  • 4. Adaptive Query Execution - Handling Skew
  • 5. Dynamic Partition Pruning
  • 6. Summary

9. Building Reliable Data Lake with Spark and Delta Lake
  • 01. Module Overview
  • 02. Need for Delta Lake with Spark
  • 03. How Delta Lake Works
  • 04. ACID Guarantees on Delta Lake
  • 05. Creating Delta Tables
  • 06. Inserting Data to Delta Table
  • 07. Performing DML Operations
  • 08. Applying Table Constraints
  • 09. Accessing Data with Time Travel
  • 10. Summary

10. Handling Streaming Data with Spark Structured Streaming
  • 1. Module Overview
  • 2. Understanding Streaming in Spark
  • 3. Structured Streaming Processing Model
  • 4. Extracting Streaming Data from Source
  • 5. Transforming and Loading Data
  • 6. Summary

11. Working with Spark in Cloud
  • 1. Module Overview
  • 2. Using Spark in Databricks
  • 3. Using Spark in Azure Synapse Analytics
  • 4. Summary

Price: 45,900 Toman
Product ID: 11689
Size: 753 MB
Duration: 379 minutes
Publication date: 21 Ordibehesht 1402 (11 May 2023)