
Reactive Kafka From Scratch

Course Outline

1. Introduction
  • 1. Introduction
  • 2. Important Note for First-Time Data Engineering Customers
  • 3. Important Note for Data Engineering Essentials (Python and Spark) Customers
  • 4. How to get 30 days of complimentary lab access

2. Setting up Environment using AWS Cloud9
  • 1. Getting Started with Cloud9
  • 2. Creating Cloud9 Environment
  • 3. Warming up with Cloud9 IDE
  • 4. Overview of EC2 related to Cloud9
  • 5. Opening ports for Cloud9 Instance
  • 6. Associating Elastic IPs to Cloud9 Instance
  • 7. Increase EBS Volume Size of Cloud9 Instance
  • 8. Setup Jupyter Lab on Cloud9
  • 9. [Commands] Setup Jupyter Lab on Cloud9

3. Setting up Environment - Overview of GCP and Provision Ubuntu VM
  • 1. Signing up for GCP
  • 2. Overview of GCP Web Console
  • 3. Overview of GCP Pricing
  • 4. Provision Ubuntu VM from GCP
  • 5. Setup Docker
  • 6. Validating Python
  • 7. Setup Jupyter Lab
  • 8. Setup Jupyter Lab locally on Mac

4. Setup Single Node Hadoop Cluster
  • 1. Introduction to Single Node Hadoop Cluster
  • 2. Setup Prerequisites
  • 3. Setup Passwordless Login
  • 4. Download and Install Hadoop
  • 5. Configure Hadoop HDFS
  • 6. Start and Validate HDFS
  • 7. Configure Hadoop YARN
  • 8. Start and Validate YARN
  • 9. Managing Single Node Hadoop
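
A quick way to confirm the start-and-validate steps above is to run the standard HDFS and YARN status commands. A minimal sketch (assuming the hadoop and yarn binaries are on PATH, as configured in this section) that shells out to them from Python:

    import subprocess

    # Each command exercises one of the daemons configured in this section:
    # HDFS (NameNode/DataNode) and YARN (ResourceManager/NodeManager).
    checks = [
        ["hdfs", "dfs", "-ls", "/"],      # HDFS answers and lists the root dir
        ["hdfs", "dfsadmin", "-report"],  # DataNode is registered with the NameNode
        ["yarn", "node", "-list"],        # NodeManager is registered with YARN
    ]

    for cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        status = "OK" if result.returncode == 0 else "FAILED"
        print(" ".join(cmd), "->", status)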

5. Setup Hive and Spark
  • 1. Setup Data Sets for Practice
  • 2. Download and Install Hive
  • 3. Setup Database for Hive Metastore
  • 4. Configure and Setup Hive Metastore
  • 5. Launch and Validate Hive
  • 6. Scripts to Manage Single Node Cluster
  • 7. Download and Install Spark 2
  • 8. Configure Spark 2
  • 9. Validate Spark 2 using CLIs
  • 10. Validate Jupyter Lab Setup
  • 11. Integrate Spark 2 with Jupyter Lab
  • 12. Download and Install Spark 3
  • 13. Configure Spark 3
  • 14. Validate Spark 3 using CLIs
  • 15. Integrate Spark 3 with Jupyter Lab
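
Once Spark is integrated with Jupyter Lab, the whole stack can be validated from a single notebook cell. A minimal sketch, assuming pyspark is importable in the notebook kernel and the YARN cluster and Hive metastore from the sections above are running:

    from pyspark.sql import SparkSession

    # Build a session against the single-node YARN cluster with Hive support,
    # mirroring the CLI validation done with spark-shell/pyspark earlier.
    spark = (
        SparkSession.builder
        .appName("validate-spark-setup")
        .master("yarn")             # YARN from the Hadoop section
        .enableHiveSupport()        # metastore configured in this section
        .getOrCreate()
    )

    print(spark.version)                # 2.x or 3.x, depending on the active install
    spark.sql("SHOW DATABASES").show()  # confirms Hive metastore connectivity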

6. Setup Single Node Kafka Cluster
  • 1. Download and Install Kafka
  • 2. Configure and Start Zookeeper
  • 3. Configure and Start Kafka Broker
  • 4. Scripts to Manage Single Node Cluster
  • 5. Overview of Kafka CLI
  • 6. Setup Retail Log Generator
  • 7. Redirecting logs to Kafka
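
Items 6 and 7 above wire the retail log generator into Kafka with CLI tools; the same flow can be sketched programmatically. A minimal example using the third-party kafka-python package (an assumption, as the lectures use the Kafka CLI; the log path and topic name are also illustrative):

    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    # Read the generated retail access log and forward each line to a topic.
    # Path and topic are illustrative; the lectures use their own names.
    with open("/opt/gen_logs/logs/access.log") as logfile:
        for line in logfile:
            producer.send("retail", line.rstrip("\n").encode("utf-8"))

    producer.flush()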

7. Getting Started with Kafka
  • 1. Overview of Kafka
  • 2. Managing Topics using Kafka CLI
  • 3. Produce and Consume Messages using CLI
  • 4. Validate Generation of Web Server Logs
  • 5. Create Web Server using nc
  • 6. Produce retail logs to Kafka Topic
  • 7. Consume retail logs from Kafka Topic
  • 8. Clean up Kafka CLI Sessions used to produce and consume messages
  • 9. Define Kafka Connect to produce
  • 10. Validate Kafka Connect to produce
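
The topic-management and produce/consume flow in this section is CLI-driven (kafka-topics.sh, kafka-console-producer.sh, kafka-console-consumer.sh); the programmatic equivalent looks like the following minimal sketch, again assuming the third-party kafka-python package and a broker on localhost:9092:

    from kafka import KafkaConsumer, KafkaProducer
    from kafka.admin import KafkaAdminClient, NewTopic

    BOOTSTRAP = "localhost:9092"

    # Create a topic, as kafka-topics.sh --create does.
    admin = KafkaAdminClient(bootstrap_servers=BOOTSTRAP)
    admin.create_topics([NewTopic(name="demo", num_partitions=1, replication_factor=1)])

    # Produce one message, as kafka-console-producer.sh does.
    producer = KafkaProducer(bootstrap_servers=BOOTSTRAP)
    producer.send("demo", b"hello kafka")
    producer.flush()

    # Consume it back, as kafka-console-consumer.sh --from-beginning does.
    consumer = KafkaConsumer(
        "demo",
        bootstrap_servers=BOOTSTRAP,
        auto_offset_reset="earliest",
        consumer_timeout_ms=5000,  # stop iterating after 5s without messages
    )
    for message in consumer:
        print(message.value.decode("utf-8"))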

8. Data Ingestion using Kafka Connect
  • 1. Overview of Kafka Connect
  • 2. Define Kafka Connect to Produce Messages
  • 3. Validate Kafka Connect to produce messages
  • 4. Cleanup Kafka Connect to produce messages
  • 5. Write Data to HDFS using Kafka Connect
  • 6. Setup HDFS 3 Sink Connector Plugin
  • 7. Overview of Kafka Consumer Groups
  • 8. Configure HDFS 3 Sink Properties
  • 9. Run and Validate HDFS 3 Sink
  • 10. Cleanup Kafka Connect to consume messages
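
Kafka Connect pipelines are defined by configuration rather than code, and a running Connect worker exposes a REST API for managing connectors. A minimal sketch that registers an HDFS 3 sink via that API; the connector class matches Confluent's HDFS 3 sink plugin set up above, while the topic, HDFS URL and flush size are illustrative assumptions for a single-node setup:

    import requests

    # Illustrative sink definition; keys follow the Connect REST contract.
    connector = {
        "name": "retail-hdfs-sink",
        "config": {
            "connector.class": "io.confluent.connect.hdfs3.Hdfs3SinkConnector",
            "topics": "retail",                    # assumption
            "hdfs.url": "hdfs://localhost:8020",   # assumption
            "flush.size": "1000",
            "tasks.max": "1",
        },
    }

    # POST /connectors on the Connect worker (default port 8083).
    resp = requests.post("http://localhost:8083/connectors", json=connector)
    resp.raise_for_status()
    print(resp.json())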

9. Overview of Spark Structured Streaming
  • 1. Understanding Streaming Context
  • 2. Validate Log Data for Streaming
  • 3. Push log messages to Netcat Webserver
  • 4. Overview of built-in Input Sources
  • 5. Reading Web Server logs using Spark Structured Streaming
  • 6. Overview of Output Modes
  • 7. Using append as Output Mode
  • 8. Using complete as Output Mode
  • 9. Using update as Output Mode
  • 10. Overview of Triggers in Spark Structured Streaming
  • 11. Overview of built-in Output Sinks
  • 12. Previewing the Streaming Data
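
The pieces covered in this section fit together in a few lines of PySpark: the built-in socket source reads from the nc-based server above, an aggregation makes the choice of output mode meaningful, and the console sink previews each micro-batch. A minimal sketch (host, port and trigger interval are assumptions):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import count

    spark = SparkSession.builder.appName("streaming-overview").getOrCreate()

    # Socket source: one row per line, in a single string column named "value".
    lines = (
        spark.readStream
        .format("socket")
        .option("host", "localhost")  # assumption: nc listening locally
        .option("port", 9000)         # assumption
        .load()
    )

    # A running count is stateful, so "complete" and "update" modes apply;
    # a plain projection with no aggregation would use "append" instead.
    counts = lines.groupBy("value").agg(count("*").alias("cnt"))

    query = (
        counts.writeStream
        .outputMode("complete")       # switch to "update" to see only changed rows
        .format("console")            # built-in sink for previewing streaming data
        .trigger(processingTime="10 seconds")
        .start()
    )
    query.awaitTermination()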

10. Kafka and Spark Structured Streaming Integration
  • 1. Create Kafka Topic
  • 2. Read Data from Kafka Topic
  • 3. Preview data using console
  • 4. Preview data using memory
  • 5. Transform Data using Spark APIs
  • 6. Write Data to HDFS using Spark
  • 7. Validate Data in HDFS using Spark
  • 8. Write Data to HDFS using Spark with a Header
  • 9. Cleanup Kafka Connect and Files in HDFS
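
The read-transform-write pipeline in this section reduces to a short PySpark job. A minimal sketch, assuming the spark-sql-kafka connector is on the classpath (e.g. via --packages) and a broker on localhost:9092; the topic, parsing logic and HDFS paths are illustrative:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, split

    spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

    # Kafka delivers keys/values as binary, so cast the value to a string.
    lines = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", "retail")  # assumption: topic from earlier sections
        .load()
        .selectExpr("CAST(value AS STRING) AS line")
    )

    # Illustrative transformation: extract the first space-delimited field
    # (the client IP in common web-server log formats).
    transformed = lines.withColumn("ip", split(col("line"), " ").getItem(0))

    # File sink to HDFS; the checkpoint is what gives exactly-once file output.
    query = (
        transformed.writeStream
        .format("csv")
        .option("path", "/user/training/retail_logs")         # assumption
        .option("checkpointLocation", "/user/training/ckpt")  # assumption
        .option("header", True)  # mirrors the with-header variant in lecture 8
        .start()
    )
    query.awaitTermination()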

11. Incremental Loads using Spark Structured Streaming
  • 1. Overview of Spark Structured Streaming Triggers
  • 2. Steps for Incremental Data Processing
  • 3. Create Working Directory in HDFS
  • 4. Logic to Upload GHArchive Files
  • 5. Upload GHArchive Files to HDFS
  • 6. Add new GHActivity JSON Files
  • 7. Read JSON Data using Spark Structured streaming
  • 8. Write in Parquet File Format
  • 9. Analyze GHArchive Data in Parquet files using Spark
  • 10. Add New GHActivity JSON files
  • 11. Load Data Incrementally to Target Table
  • 12. Validate Incremental Load
  • 13. Add New GHActivity JSON files
  • 14. Using maxFilesPerTrigger and latestFirst
  • 15. Validate Incremental Load
  • 16. Add New GHActivity JSON files
  • 17. Incremental Load using Archival Process
  • 18. Validate Incremental Load
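
The incremental pattern these lectures build up can be summarized in one PySpark job: a file source over the JSON landing area, maxFilesPerTrigger and latestFirst bounding how the backlog is picked up each micro-batch, and a checkpoint recording which files were already processed, which is what makes repeated runs incremental. A minimal sketch (HDFS paths are assumptions; streaming file sources need an explicit schema, inferred here once from the files already landed):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("gharchive-incremental").getOrCreate()

    landing = "/user/training/gharchive/landing"  # assumption

    # Infer the schema once from existing files; streaming reads require it.
    schema = spark.read.json(landing).schema

    events = (
        spark.readStream
        .schema(schema)
        .option("maxFilesPerTrigger", 10)  # at most 10 new files per micro-batch
        .option("latestFirst", True)       # prefer the newest files in a backlog
        .json(landing)
    )

    # Parquet target; the checkpoint tracks consumed files across runs.
    query = (
        events.writeStream
        .format("parquet")
        .option("path", "/user/training/gharchive/target")            # assumption
        .option("checkpointLocation", "/user/training/gharchive/ckpt")
        .trigger(processingTime="30 seconds")
        .start()
    )
    query.awaitTermination()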
  • 45,900 تومان
    بیش از یک محصول به صورت دانلودی میخواهید؟ محصول را به سبد خرید اضافه کنید.
    خرید دانلودی فوری

    در این روش نیاز به افزودن محصول به سبد خرید و تکمیل اطلاعات نیست و شما پس از وارد کردن ایمیل خود و طی کردن مراحل پرداخت لینک های دریافت محصولات را در ایمیل خود دریافت خواهید کرد.

    ایمیل شما:
    تولید کننده:
ID: 12530
Size: 4142 MB
Duration: 567 minutes
Release date: 20 Khordad 1402 (June 10, 2023)