Big Data Analytics with Hadoop and Apache Spark

Course outline

Apache Hadoop pioneered big data technologies and continues to play a leading role in enterprise big data storage. Apache Spark is a leading big data processing engine with a broad set of features and capabilities. Used together, the Hadoop Distributed File System (HDFS) and Spark provide a scalable setup for big data analytics. In this course, data analytics expert Kumaran Ponnambalam shows you how to use these two technologies to build scalable, optimized data analytics pipelines. Explore ways to optimize data modeling and storage on HDFS, learn scalable data ingestion and extraction with Spark, and review actionable tips for optimizing data processing in Spark. Finally, complete a use case project to practice the new techniques.

01 - Introduction

01 - The combined power of Spark and Hadoop Distributed File System (HDFS)

02 - 1. Introduction and Setup

01 - Apache Hadoop overview

02 - Apache Spark overview

03 - Integrating Spark and Hadoop

04 - Using exercise files

03 - 2. HDFS Data Modeling for Analytics

01 - Storage formats

02 - Compression

03 - Partitioning

04 - Bucketing

05 - Best practices for data storage

04 - 3. Data Ingestion with Spark

01 - Reading external files into Spark

02 - Writing to HDFS

03 - Parallel writes with partitioning

04 - Parallel writes with bucketing

05 - Best practices for ingestion

05 - 4. Data Extraction with Spark

01 - How Spark works

02 - Reading HDFS files with schema

03 - Reading partitioned data

04 - Reading bucketed data

05 - Best practices for data extraction

06 - 5. Optimizing Spark Processing

01 - Pushing down projections

02 - Pushing down filters

03 - Managing partitions

04 - Improving joins

05 - Storing intermediate results

06 - Best practices for data processing

07 - 6. Use Case Project

01 - Problem definition

02 - Data loading

03 - Total score analytics

04 - Average score analytics

05 - Top student analytics

08 - Conclusion

01 - Continuing on with big data analytics

189,000 Toman


Producer: LinkedIn Learning (Lynda)