Farin Company specialist website

Apache Spark 3 Fundamentals

Course Outline

Learn the fundamentals of Apache Spark 3: set up a Spark environment, process data with RDDs and DataFrames, optimize applications, and build pipelines with Databricks and Azure Synapse Analytics. The course also introduces Spark's ecosystem.


1. Course Overview
  • 1. Course Trailer

2. Getting Started with Apache Spark
  • 1. Introduction and Course Outline
  • 2. Version Check
  • 3. Need for Apache Spark
  • 4. Understanding Spark Architecture and Ecosystem
  • 5. How Execution Happens in Spark
  • 6. Spark APIs - RDDs, DataFrames and Datasets
  • 7. Summary

3. Setting up Spark Environment
  • 1. Module Overview
  • 2. Understanding Spark Environments
  • 3. Installing Spark
  • 4. Monitoring Spark with Web UI
  • 5. Option 1 - Running Spark in Command Line
  • 6. Option 2 - Running Spark with Jupyter Notebooks
  • 7. Option 3 - Creating Project with PyCharm IDE
  • 8. Option 4 - Running Jobs with Spark Submit
  • 9. Setting Up Multi-Node Cluster
  • 10. Summary

4. Working with RDDs - Resilient Distributed Datasets
  • 1. Module Overview
  • 2. Understanding RDDs
  • 3. Creating RDDs
  • 4. Working with Pair RDDs
  • 5. Applying Operations on RDDs
  • 6. Using Narrow Transformations
  • 7. Wide Transformations and Data Shuffling
  • 8. Spark Application Concepts - Jobs, Stages and Tasks
  • 9. Summary

5. Cleaning and Transforming Data with DataFrames
  • 1. Module Overview
  • 2. Understanding DataFrames
  • 3. Creating DataFrames
  • 4. Applying Schemas
  • 5. Analyzing and Cleaning Data
  • 6. Applying Transformations
  • 7. Handling Corrupt Data
  • 8. Saving Processed Data to Files
  • 9. Summary

6. Working with Spark SQL, UDFs, and Common DataFrame Operations
  • 1. Module Overview
  • 2. Running SQL Queries on DataFrames
  • 3. Working with Spark Tables
  • 4. Working with User Defined Functions (UDFs)
  • 5. Performing Operations on Multiple Datasets
  • 6. Performing Window Operations
  • 7. Summary

7. Performing Optimizations in Spark
  • 1. Module Overview
  • 2. Working with Spark Partitions
  • 3. Changing DataFrame Partitions
  • 4. Memory Management
  • 5. Persisting Data
  • 6. Spark Join Strategies and Broadcast Joins
  • 7. Optimizing Shuffle Sort Join with Bucketing
  • 8. Dynamic Resource Allocation
  • 9. Resource Allocation Using Fair Scheduling
  • 10. Summary

8. Features in Apache Spark 3
  • 1. Introduction to Apache Spark 3
  • 2. Adaptive Query Execution - Dynamic Coalescing
  • 3. Adaptive Query Execution - Dynamic Join
  • 4. Adaptive Query Execution - Handling Skew
  • 5. Dynamic Partition Pruning
  • 6. Summary

9. Building Reliable Data Lake with Spark and Delta Lake
  • 1. Module Overview
  • 2. Need for Delta Lake with Spark
  • 3. How Delta Lake Works
  • 4. ACID Guarantees on Delta Lake
  • 5. Creating Delta Tables
  • 6. Inserting Data to Delta Table
  • 7. Performing DML Operations
  • 8. Applying Table Constraints
  • 9. Accessing Data with Time Travel
  • 10. Summary

10. Handling Streaming Data with Spark Structured Streaming
  • 1. Module Overview
  • 2. Understanding Streaming in Spark
  • 3. Structured Streaming Processing Model
  • 4. Extracting Streaming Data from Source
  • 5. Transforming and Loading Data
  • 6. Summary

11. Working with Spark in Cloud
  • 1. Module Overview
  • 2. Using Spark in Databricks
  • 3. Using Spark in Azure Synapse Analytics
  • 4. Summary

Price: 139,000 Toman
Producer:
Instructor:
ID: 11689
Size: 753 MB
Duration: 379 minutes
Publication date: 21 Ordibehesht 1402 (11 May 2023)