
Reactive Kafka From Scratch

Course Outline

1. Introduction
  • 1. Introduction
  • 2. Important Note for First-Time Data Engineering Customers
  • 3. Important Note for Data Engineering Essentials (Python and Spark) Customers
  • 4. How to get 30 days of complimentary lab access

2. Setting up Environment using AWS Cloud9 (sketch after this list)
  • 1. Getting Started with Cloud9
  • 2. Creating Cloud9 Environment
  • 3. Warming up with Cloud9 IDE
  • 4. Overview of EC2 related to Cloud9
  • 5. Opening ports for Cloud9 Instance
  • 6. Associating Elastic IPs to Cloud9 Instance
  • 7. Increase EBS Volume Size of Cloud9 Instance
  • 8. Setup Jupyter Lab on Cloud9
  • 9. [Commands] Setup Jupyter Lab on Cloud9.html
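
The course opens ports, attaches an Elastic IP, and grows the EBS volume (lessons 5-7) through the AWS console; a minimal boto3 sketch of the same calls, with placeholder instance and volume IDs you would look up yourself:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

    # Lesson 6: allocate an Elastic IP and attach it to the Cloud9
    # EC2 instance (the instance ID below is a placeholder).
    alloc = ec2.allocate_address(Domain="vpc")
    ec2.associate_address(
        InstanceId="i-0123456789abcdef0",
        AllocationId=alloc["AllocationId"],
    )

    # Lesson 7: grow the EBS root volume (placeholder volume ID); the
    # filesystem still has to be extended from inside the instance.
    ec2.modify_volume(VolumeId="vol-0123456789abcdef0", Size=32)  # GiB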

3. Setting up Environment - Overview of GCP and Provision Ubuntu VM (sketch after this list)
  • 1.1 Signing up for GCP.html
  • 1. Signing up for GCP
  • 2.1 Understanding GCP Web Console.html
  • 2. Overview of GCP Web Console
  • 3.1 Overview of GCP Pricing.html
  • 3. Overview of GCP Pricing
  • 4.1 Provision Ubuntu 18.04 Virtual Machine.html
  • 4. Provision Ubuntu VM from GCP
  • 5.1 Setup Docker.html
  • 5. Setup Docker
  • 6.1 Validating Python.html
  • 6. Validating Python
  • 7.1 Setup Jupyter Lab.html
  • 7. Setup Jupyter Lab
  • 8. Setup Jupyter Lab locally on Mac
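
Lessons 5 and 6 validate the tooling on the fresh VM; a tiny, purely illustrative Python version of those checks:

    import subprocess
    import sys

    # Lesson 6: confirm which Python interpreter the VM is running.
    print(sys.version)

    # Lesson 5: confirm Docker installed correctly; this shells out to
    # the docker CLI and raises CalledProcessError if it is broken.
    result = subprocess.run(["docker", "--version"],
                            capture_output=True, text=True, check=True)
    print(result.stdout)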

4. Setup Single Node Hadoop Cluster (sketch after this list)
  • 1. Introduction to Single Node Hadoop Cluster
  • 2. Setup Prerequisites
  • 3. Setup Passwordless Login
  • 4. Download and Install Hadoop
  • 5. Configure Hadoop HDFS
  • 6. Start and Validate HDFS
  • 7. Configure Hadoop YARN
  • 8. Start and Validate YARN
  • 9. Managing Single Node Hadoop
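
For lessons 5-8, a pseudo-distributed (single-node) HDFS needs only a couple of properties; a minimal sketch of the two files, using the stock values from the Hadoop single-node guide rather than anything confirmed by the course:

    <!-- etc/hadoop/core-site.xml -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

    <!-- etc/hadoop/hdfs-site.xml -->
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value> <!-- one node, so no replication -->
      </property>
    </configuration>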

5. Setup Hive and Spark (sketch after this list)
  • 1. Setup Data Sets for Practice
  • 2. Download and Install Hive
  • 3. Setup Database for Hive Metastore
  • 4. Configure and Setup Hive Metastore
  • 5. Launch and Validate Hive
  • 6. Scripts to Manage Single Node Cluster
  • 7. Download and Install Spark 2
  • 8. Configure Spark 2
  • 9. Validate Spark 2 using CLIs
  • 10. Validate Jupyter Lab Setup
  • 11. Integrate Spark 2 with Jupyter Lab
  • 12. Download and Install Spark 3
  • 13. Configure Spark 3
  • 14. Validate Spark 3 using CLIs
  • 15. Integrate Spark 3 with Jupyter Lab
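
Once Spark is wired into Jupyter Lab (lessons 11 and 15), a session with Hive support is the quickest end-to-end check; a minimal sketch, assuming the metastore from lessons 3-4 is reachable:

    from pyspark.sql import SparkSession

    # Build a session that can see the tables registered in the Hive
    # metastore configured earlier in this section.
    spark = (
        SparkSession.builder
        .appName("validate-setup")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Quick sanity check against the metastore.
    spark.sql("SHOW DATABASES").show()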

6. Setup Single Node Kafka Cluster (sketch after this list)
  • 1. Download and Install Kafka
  • 2. Configure and Start Zookeeper
  • 3. Configure and Start Kafka Broker
  • 4. Scripts to Manage Single Node Cluster
  • 5. Overview of Kafka CLI
  • 6. Setup Retail Log Generator
  • 7. Redirecting logs to Kafka
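
Lessons 2 and 3 come down to a handful of entries in the broker's config; a sketch of the single-node essentials in config/server.properties, using the stock defaults rather than course-confirmed values:

    # config/server.properties (single-node essentials)
    broker.id=0
    listeners=PLAINTEXT://:9092
    # where the broker stores partition data
    log.dirs=/tmp/kafka-logs
    # the Zookeeper instance started in lesson 2
    zookeeper.connect=localhost:2181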

7. Getting Started with Kafka (sketch after this list)
  • 1. Overview of Kafka
  • 2. Managing Topics using Kafka CLI
  • 3. Produce and Consume Messages using CLI
  • 4. Validate Generation of Web Server Logs
  • 5. Create Web Server using nc
  • 6. Produce retail logs to Kafka Topic
  • 7. Consume retail logs from Kafka Topic
  • 8. Clean up Kafka CLI Sessions to produce and consume messages
  • 9. Define Kafka Connect to produce
  • 10. Validate Kafka Connect to produce
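
This section drives everything through the Kafka CLI scripts; as a rough Python equivalent of lessons 2, 6, and 7 using the kafka-python package (the topic name and broker address are assumptions):

    from kafka import KafkaConsumer, KafkaProducer
    from kafka.admin import KafkaAdminClient, NewTopic

    BOOTSTRAP = "localhost:9092"  # assumed single-node broker

    # Lesson 2 equivalent: create a topic (the CLI uses kafka-topics.sh).
    admin = KafkaAdminClient(bootstrap_servers=BOOTSTRAP)
    admin.create_topics([NewTopic(name="retail_logs", num_partitions=1,
                                  replication_factor=1)])

    # Lesson 6 equivalent: produce one retail log line to the topic.
    producer = KafkaProducer(bootstrap_servers=BOOTSTRAP)
    producer.send("retail_logs", b"GET /categories/kitchen/products HTTP/1.1")
    producer.flush()

    # Lesson 7 equivalent: read it back from the start of the topic.
    consumer = KafkaConsumer("retail_logs", bootstrap_servers=BOOTSTRAP,
                             auto_offset_reset="earliest",
                             consumer_timeout_ms=5000)
    for record in consumer:
        print(record.value.decode())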

8. Data Ingestion using Kafka Connect (sketch after this list)
  • 1. Overview of Kafka Connect
  • 2. Define Kafka Connect to Produce Messages
  • 3. Validate Kafka Connect to produce messages
  • 4. Cleanup Kafka Connect to produce messages
  • 5. Write Data to HDFS using Kafka Connect
  • 6. Setup HDFS 3 Sink Connector Plugin
  • 7. Overview of Kafka Consumer Groups
  • 8. Configure HDFS 3 Sink Properties
  • 9. Run and Validate HDFS 3 Sink
  • 10. Cleanup Kafka Connect to consume messages
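
A connector is just JSON registered with the Connect worker's REST API; a hedged sketch of the HDFS 3 Sink definition from lessons 6-9 (the connector class and option names follow Confluent's HDFS 3 Sink documentation as I recall it, and the URLs and topic are assumptions):

    import requests

    connector = {
        "name": "retail-logs-hdfs-sink",
        "config": {
            "connector.class": "io.confluent.connect.hdfs3.Hdfs3SinkConnector",
            "tasks.max": "1",
            "topics": "retail_logs",              # topic from section 7
            "hdfs.url": "hdfs://localhost:9000",  # HDFS from section 4
            "flush.size": "1000",                 # records per output file
        },
    }

    # Register the connector; Connect's REST API listens on 8083 by default.
    resp = requests.post("http://localhost:8083/connectors", json=connector)
    resp.raise_for_status()
    print(resp.json())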

9. Overview of Spark Structured Streaming (sketch after this list)
  • 1. Understanding Streaming Context
  • 2. Validate Log Data for Streaming
  • 3. Push log messages to Netcat Web Server
  • 4. Overview of built-in Input Sources
  • 5. Reading Web Server logs using Spark Structured Streaming
  • 6. Overview of Output Modes
  • 7. Using append as Output Mode
  • 8. Using complete as Output Mode
  • 9. Using update as Output Mode
  • 10. Overview of Triggers in Spark Structured Streaming
  • 11. Overview of built-in Output Sinks
  • 12. Previewing the Streaming Data

10. Kafka and Spark Structured Streaming Integration (sketch after this list)
  • 1. Create Kafka Topic
  • 2. Read Data from Kafka Topic
  • 3. Preview data using console
  • 4. Preview data using memory
  • 5. Transform Data using Spark APIs
  • 6. Write Data to HDFS using Spark
  • 7. Validate Data in HDFS using Spark
  • 8. Write Data to HDFS using Spark with Header
  • 9. Cleanup Kafka Connect and Files in HDFS
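
The section condensed into one sketch: subscribe to the topic, cast the binary payload, and stream it into HDFS (the topic, paths, and checkpoint location are assumptions):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

    # Lesson 2: Kafka delivers key and value as binary columns.
    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "retail_logs")
          .load())

    # Lesson 5: pull the payload out of the binary "value" column.
    logs = df.selectExpr("CAST(value AS STRING) AS message")

    # Lessons 6-8: write to HDFS; the checkpoint directory is what lets
    # the query restart without losing or reprocessing data.
    query = (logs.writeStream
             .format("text")
             .option("path", "hdfs://localhost:9000/user/demo/retail_logs")
             .option("checkpointLocation",
                     "hdfs://localhost:9000/user/demo/checkpoints/retail_logs")
             .start())
    query.awaitTermination()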

11. Incremental Loads using Spark Structured Streaming (sketch after this list)
  • 1. Overview of Spark Structured Streaming Triggers
  • 2. Steps for Incremental Data Processing
  • 3. Create Working Directory in HDFS
  • 4. Logic to Upload GHArchive Files
  • 5. Upload GHArchive Files to HDFS
  • 6. Add new GHActivity JSON Files
  • 7. Read JSON Data using Spark Structured Streaming
  • 8. Write in Parquet File Format
  • 9. Analyze GHArchive Data in Parquet files using Spark
  • 10. Add New GHActivity JSON files
  • 11. Load Data Incrementally to Target Table
  • 12. Validate Incremental Load
  • 13. Add New GHActivity JSON files
  • 14. Using maxFilesPerTrigger and latestFirst
  • 15. Validate Incremental Load
  • 16. Add New GHActivity JSON files
  • 17. Incremental Load using Archival Process
  • 18. Validate Incremental Load
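
The core of this section in one sketch: treat an HDFS landing directory as a streaming source so each trigger only picks up new GHArchive files (the schema shortcut, paths, and trigger interval are assumptions):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("gharchive-incremental").getOrCreate()

    # File sources need an explicit schema; inferring it from the files
    # already landed is a common shortcut, not a course-confirmed step.
    landing = "hdfs://localhost:9000/user/demo/gharchive/landing"
    schema = spark.read.json(landing).schema

    # Lesson 14: maxFilesPerTrigger bounds how many new files each
    # micro-batch reads, and latestFirst processes newest files first.
    ghactivity = (spark.readStream
                  .schema(schema)
                  .option("maxFilesPerTrigger", 2)
                  .option("latestFirst", "true")
                  .json(landing))

    # Lessons 8 and 11: append into a Parquet target; the checkpoint
    # records which source files have already been processed.
    query = (ghactivity.writeStream
             .trigger(processingTime="30 seconds")
             .format("parquet")
             .option("path", "hdfs://localhost:9000/user/demo/gharchive/target")
             .option("checkpointLocation",
                     "hdfs://localhost:9000/user/demo/gharchive/checkpoint")
             .start())
    query.awaitTermination()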
Price: 139,000 Toman
    Producer:
    ID: 12530
    Size: 4142 MB
    Duration: 567 minutes
    Release date: 20 Khordad 1402 (June 10, 2023)