
Reactive Kafka From Scratch

Course Outline

1. Introduction
  • 1. Introduction
  • 2. Important Note for first time Data Engineering Customers
  • 3. Important Note for Data Engineering Essentials (Python and Spark) Customers
  • 4. How to get 30 days complimentary lab access

2. Setting up Environment using AWS Cloud9
  • 1. Getting Started with Cloud9
  • 2. Creating Cloud9 Environment
  • 3. Warming up with Cloud9 IDE
  • 4. Overview of EC2 related to Cloud9
  • 5. Opening ports for Cloud9 Instance
  • 6. Associating Elastic IPs to Cloud9 Instance
  • 7. Increase EBS Volume Size of Cloud9 Instance
  • 8. Setup Jupyter Lab on Cloud9
  • 9. [Commands] Setup Jupyter Lab on Cloud9.html

3. Setting up Environment - Overview of GCP and Provision Ubuntu VM
  • 1.1 Signing up for GCP.html
  • 1. Signing up for GCP
  • 2.1 Understanding GCP Web Console.html
  • 2. Overview of GCP Web Console
  • 3.1 Overview of GCP Pricing.html
  • 3. Overview of GCP Pricing
  • 4.1 Provision Ubuntu 18.04 Virtual Machine.html
  • 4. Provision Ubuntu VM from GCP
  • 5.1 Setup Docker.html
  • 5. Setup Docker
  • 6.1 Validating Python.html
  • 6. Validating Python
  • 7.1 Setup Jupyter Lab.html
  • 7. Setup Jupyter Lab
  • 8. Setup Jupyter Lab locally on Mac

4. Setup Single Node Hadoop Cluster
  • 1. Introduction to Single Node Hadoop Cluster
  • 2. Setup Prerequisites
  • 3. Setup Passwordless Login
  • 4. Download and Install Hadoop
  • 5. Configure Hadoop HDFS
  • 6. Start and Validate HDFS
  • 7. Configure Hadoop YARN
  • 8. Start and Validate YARN
  • 9. Managing Single Node Hadoop

5. Setup Hive and Spark
  • 1. Setup Data Sets for Practice
  • 2. Download and Install Hive
  • 3. Setup Database for Hive Metastore
  • 4. Configure and Setup Hive Metastore
  • 5. Launch and Validate Hive
  • 6. Scripts to Manage Single Node Cluster
  • 7. Download and Install Spark 2
  • 8. Configure Spark 2
  • 9. Validate Spark 2 using CLIs
  • 10. Validate Jupyter Lab Setup
  • 11. Integrate Spark 2 with Jupyter Lab
  • 12. Download and Install Spark 3
  • 13. Configure Spark 3
  • 14. Validate Spark 3 using CLIs
  • 15. Integrate Spark 3 with Jupyter Lab
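
The Jupyter Lab integration lessons come down to one check: open a SparkSession from a notebook and reach the Hive metastore. A minimal smoke test along those lines (a sketch only; the app name and queries are illustrative, not the course's exact script):

    # Minimal validation sketch for the Spark + Jupyter Lab integration.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("validate-spark-setup")   # illustrative name
        .enableHiveSupport()               # requires the Hive metastore configured above
        .getOrCreate()
    )

    spark.sql("SHOW DATABASES").show()     # confirms metastore connectivity
    spark.range(10).selectExpr("id", "id * 2 AS doubled").show()  # basic DataFrame check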

6. Setup Single Node Kafka Cluster
  • 1. Download and Install Kafka
  • 2. Configure and Start Zookeeper
  • 3. Configure and Start Kafka Broker
  • 4. Scripts to Manage Single Node Cluster
  • 5. Overview of Kafka CLI
  • 6. Setup Retail Log Generator
  • 7. Redirecting logs to Kafka
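
The last two lessons redirect generated retail logs into Kafka. A hedged sketch of the same idea in Python with kafka-python (the course may instead pipe tail output into the console producer); the broker address, topic name, and log path are assumptions:

    import time

    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    # Follow the retail log file and forward each new line, like `tail -f`.
    with open("/opt/gen_logs/logs/access.log", "rb") as logs:  # assumed path
        logs.seek(0, 2)             # start at the end of the file
        while True:
            line = logs.readline()
            if not line:
                time.sleep(0.2)     # wait for the generator to write more
                continue
            producer.send("retail", line.rstrip(b"\n"))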

7. Getting Started with Kafka
  • 1. Overview of Kafka
  • 2. Managing Topics using Kafka CLI
  • 3. Produce and Consume Messages using CLI
  • 4. Validate Generation of Web Server Logs
  • 5. Create Web Server using nc
  • 6. Produce retail logs to Kafka Topic
  • 7. Consume retail logs from Kafka Topic
  • 8. Clean up Kafka CLI Sessions to produce and consume messages
  • 9. Define Kafka Connect to produce
  • 10. Validate Kafka Connect to produce

8. Data Ingestion using Kafka Connect
  • 1. Overview of Kafka Connect
  • 2. Define Kafka Connect to Produce Messages
  • 3. Validate Kafka Connect to produce messages
  • 4. Cleanup Kafka Connect to produce messages
  • 5. Write Data to HDFS using Kafka Connect
  • 6. Setup HDFS 3 Sink Connector Plugin
  • 7. Overview of Kafka Consumer Groups
  • 8. Configure HDFS 3 Sink Properties
  • 9. Run and Validate HDFS 3 Sink
  • 10. Cleanup Kafka Connect to consume messages
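
Defining and validating the HDFS 3 Sink boils down to handing Kafka Connect a connector configuration. One way to picture it is a POST to the Connect REST API (assumed to be listening on localhost:8083); the connector name, topic, and property values are illustrative, not the course's exact properties:

    import requests

    connector = {
        "name": "hdfs3-sink",                # assumed name
        "config": {
            "connector.class": "io.confluent.connect.hdfs3.Hdfs3SinkConnector",
            "tasks.max": "1",
            "topics": "retail",              # assumed topic
            "hdfs.url": "hdfs://localhost:9000",  # assumed namenode address
            "flush.size": "1000",
        },
    }

    resp = requests.post("http://localhost:8083/connectors", json=connector)
    resp.raise_for_status()
    print(resp.json())   # Connect echoes back the accepted definition

    # Validate: check the connector and task status.
    print(requests.get("http://localhost:8083/connectors/hdfs3-sink/status").json())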

9. Overview of Spark Structured Streaming
  • 1. Understanding Streaming Context
  • 2. Validate Log Data for Streaming
  • 3. Push log messages to Netcat Webserver
  • 4. Overview of built-in Input Sources
  • 5. Reading Web Server logs using Spark Structured Streaming
  • 6. Overview of Output Modes
  • 7. Using append as Output Mode
  • 8. Using complete as Output Mode
  • 9. Using update as Output Mode
  • 10. Overview of Triggers in Spark Structured Streaming
  • 11. Overview of built-in Output Sinks
  • 12. Previewing the Streaming Data
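
The source, output-mode, and trigger lessons fit in one small PySpark job: read lines from the Netcat web server, aggregate, and switch the outputMode on a console sink. A minimal sketch, assuming port 9999 and space-delimited log lines:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, split

    spark = SparkSession.builder.appName("socket-streaming-demo").getOrCreate()

    # Socket input source: one row per line received from nc.
    lines = (
        spark.readStream
        .format("socket")
        .option("host", "localhost")
        .option("port", 9999)
        .load()
    )

    # Count lines per first token (e.g., the client IP in a web server log).
    counts = lines.groupBy(split(col("value"), " ").getItem(0).alias("key")).count()

    query = (
        counts.writeStream
        .outputMode("complete")                # try "update" as well; "append"
                                               # needs a watermark with aggregations
        .format("console")                     # console sink for previewing data
        .trigger(processingTime="10 seconds")  # micro-batch trigger
        .start()
    )
    query.awaitTermination()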

10. Kafka and Spark Structured Streaming Integration
  • 1. Create Kafka Topic
  • 2. Read Data from Kafka Topic
  • 3. Preview data using console
  • 4. Preview data using memory
  • 5. Transform Data using Spark APIs
  • 6. Write Data to HDFS using Spark
  • 7. Validate Data in HDFS using Spark
  • 8. Write Data to HDFS using Spark with Header
  • 9. Cleanup Kafka Connect and Files in HDFS
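
An end-to-end sketch of this section's flow: subscribe to the Kafka topic, cast the binary payload to a string, and stream it to HDFS as CSV with a header. The topic and paths are assumptions, and the Kafka source needs the spark-sql-kafka package on the classpath:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

    # Kafka input source; key and value arrive as binary columns.
    df = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", "retail")       # assumed topic
        .load()
    )

    messages = df.select(col("value").cast("string").alias("message"))

    query = (
        messages.writeStream
        .format("csv")
        .option("header", "true")            # the "with Header" lesson
        .option("path", "hdfs://localhost:9000/user/retail/data")
        .option("checkpointLocation", "hdfs://localhost:9000/user/retail/checkpoint")
        .start()
    )
    query.awaitTermination()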

11. Incremental Loads using Spark Structured Streaming
  • 1. Overview of Spark Structured Streaming Triggers
  • 2. Steps for Incremental Data Processing
  • 3. Create Working Directory in HDFS
  • 4. Logic to Upload GHArchive Files
  • 5. Upload GHArchive Files to HDFS
  • 6. Add new GHActivity JSON Files
  • 7. Read JSON Data using Spark Structured streaming
  • 8. Write in Parquet File Format
  • 9. Analyze GHArchive Data in Parquet files using Spark
  • 10. Add New GHActivity JSON files
  • 11. Load Data Incrementally to Target Table
  • 12. Validate Incremental Load
  • 13. Add New GHActivity JSON files
  • 14. Using maxFilesPerTrigger and latestFirst
  • 15. Validate Incremental Load
  • 16. Add New GHActivity JSON files
  • 17. Incremental Load using Archival Process
  • 18. Validate Incremental Load
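
The incremental-load pattern here throttles a file-source stream with maxFilesPerTrigger and latestFirst so each micro-batch picks up only a bounded set of new GHArchive files. An illustrative sketch, with assumed HDFS paths and a schema inferred from a one-off batch read:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ghactivity-incremental").getOrCreate()

    landing = "hdfs://localhost:9000/user/ghactivity/landing"   # assumed paths
    target = "hdfs://localhost:9000/user/ghactivity/target"
    checkpoint = "hdfs://localhost:9000/user/ghactivity/checkpoint"

    # File sources need an explicit schema; infer one from files already staged.
    schema = spark.read.json(landing).schema

    ghactivity = (
        spark.readStream
        .schema(schema)
        .option("maxFilesPerTrigger", 8)   # cap files consumed per micro-batch
        .option("latestFirst", "true")     # process the newest files first
        .json(landing)
    )

    query = (
        ghactivity.writeStream
        .format("parquet")                 # target table in Parquet format
        .option("path", target)
        .option("checkpointLocation", checkpoint)
        .trigger(processingTime="30 seconds")
        .start()
    )
    query.awaitTermination()
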
Price: 139,000 Toman
ID: 12530
Size: 4142 MB
Duration: 567 minutes
Release date: 20 Khordad 1402 (June 10, 2023)