
Master Data Engineering using GCP Data Analytics

Course Outline

Learn GCS for Data Lake, BigQuery for Data Warehouse, GCP Dataproc and Databricks for Big Data Pipelines


1. Introduction to Data Engineering using GCP Data Analytics
  • 1. Introduction to Data Engineering using GCP Data Analytics
  • 2. Pre-requisites for Data Engineering using GCP Data Analytics
  • 3. Highlights of the Data Engineering using GCP Data Analytics Course
  • 4. Overview of Udemy Platform to take the course effectively
  • 5. Refund Policy and Request for Rating and Feedback

2. Setup Environment for Data Engineering using GCP Data Analytics
  • 1. Review Data Engineering on GCP Folder
  • 2. Setup VS Code Workspace for Data Engineering on GCP
  • 3. Setup and Integrate Python 3.9 venv with VS Code Workspace

3. Getting Started with GCP for Data Engineering using GCP Data Analytics
  • 1. Introduction to Getting Started with GCP
  • 2. Pre-requisite Skills to Sign up for course on GCP Data Analytics
  • 3. Overview of Cloud Platforms
  • 4. Overview of Google Cloud Platform or GCP
  • 5. Overview of Signing up for GCP Account
  • 6. Create New Google Account using Non Gmail Id
  • 7. Sign up for GCP using Google Account
  • 8. Overview of GCP Credits
  • 9. Overview of GCP Project and Billing
  • 10. Overview of Google Cloud Shell
  • 11. Install Google Cloud SDK on Windows
  • 12. Initialize gcloud CLI using GCP Project
  • 13. Reinitialize Google Cloud Shell with Project ID
  • 14. Overview of Analytics Services on GCP
  • 15. Conclusion to Get Started with GCP for Data Engineering

4. Setting up Data Lake using Google Cloud Storage
  • 1. Getting Started with Google Cloud Storage or GCS
  • 2. Overview of Google Cloud Storage or GCS Web UI
  • 3. Upload Folders and Files into GCS Bucket using GCP Web UI
  • 4. Review GCS Buckets and Objects using gsutil commands
  • 5. Delete GCS Bucket using Web UI
  • 6. Setup Data Repository in Google Cloud Shell
  • 7. Overview of Data Sets
  • 8. Managing Buckets in GCS using gsutil
  • 9. Copy Data Sets into GCS using gsutil
  • 10. Cleanup Buckets in GCS using gsutil
  • 11. Exercise to Manage Buckets and Files in GCS using gsutil
  • 12. Overview of Setting up Data Lake using GCS
  • 13. Setup Google Cloud Libraries in Python Virtual Environment
  • 14. Setup Bucket and Files in GCS using gsutil
  • 15. Getting Started to manage files in GCS using Python
  • 16. Setup Credentials for Python and GCS Integration
  • 17. Review Methods in Google Cloud Storage Python library
  • 18. Get GCS Bucket Details using Python
  • 19. Manage Blobs or Files in GCS using Python
  • 20. Project Problem Statement to Manage Files in GCS using Python
  • 21. Design to Upload multiple files into GCS using Python
  • 22. Get File Names to upload into GCS using Python glob and os
  • 23. Upload all Files to GCS as blobs using Python
  • 24. Validate Files or Blobs in GCS using Python
  • 25. Overview of Processing Data in GCS using Pandas
  • 26. Convert Data to Parquet and Write to GCS using Pandas
  • 27. Design to Upload multiple files into GCS using Pandas
  • 28. Get File Names to upload into GCS using Python glob and os
  • 29. Overview of Parquet File Format and Schemas JSON File
  • 30. Get Column Names for Dataset using Schemas JSON File
  • 31. Upload all Files to GCS as Parquet using Pandas
  • 32. Perform Validation of Files Copied using Pandas

5. Setup Postgres Database using Cloud SQL
  • 1. Overview of GCP Cloud SQL
  • 2. Setup Postgres Database Server using GCP Cloud SQL
  • 3. Configure Network for Cloud SQL Postgres Database
  • 4. Validate Client Tools for Postgres on Mac or PC
  • 5. Setup Database in GCP Cloud SQL Postgres Database Server
  • 6. Setup Tables in GCP Cloud SQL Postgres Database
  • 7. Validate Data in GCP Cloud SQL Postgres Database Tables
  • 8. Integration of GCP Cloud SQL Postgres with Python
  • 9. Overview of Integration of GCP Cloud SQL Postgres with Pandas
  • 10. Read Data From Files to Pandas Data Frame
  • 11. Process Data using Pandas Dataframe APIs
  • 12. Write Pandas Dataframe into Postgres Database Table
  • 13. Validate Data in Postgres Database Tables using Pandas
  • 14. Getting Started with Secrets using GCP Secret Manager
  • 15. Configure Access to GCP Secret Manager via IAM Roles
  • 16. Install Google Cloud Secret Manager Python Library
  • 17. Get Secret Details from GCP Secret Manager using Python
  • 18. Connect to Database using Credentials from Secret Manager
  • 19. Stop GCP Cloud SQL Postgres Database Server

6. Big Data Processing using Google Dataproc
  • 1. Getting Started with GCP Dataproc
  • 2. Setup Single Node Dataproc Cluster for Development
  • 3. Validate SSH Connectivity to Master Node of Dataproc Cluster
  • 4. Allocate Static IP to the Master Node VM of Dataproc Cluster
  • 5. Setup VS Code Remote Window for Dataproc VM
  • 6. Setup Workspace using VS Code on Dataproc
  • 7. Getting Started with HDFS Commands on Dataproc
  • 8. Recap of gsutil to manage files and folders in GCS
  • 9. Review Data Sets setup on Dataproc Master Node VM
  • 10. Copy Local Files into HDFS on Dataproc
  • 11. Copy GCS Files into HDFS on Dataproc
  • 12. Validate Pyspark CLI in Dataproc Cluster
  • 13. Validate Spark Scala CLI in Dataproc Cluster
  • 14. Validate Spark SQL CLI in Dataproc Cluster

7. ELT Data Pipelines using Dataproc on GCP
  • 1. Overview of GCP Dataproc Jobs and Workflow
  • 2. Setup JSON Dataset in GCS for Dataproc Jobs
  • 3. Review Spark SQL Commands used for Dataproc Jobs
  • 4. Run Dataproc Job using Spark SQL
  • 5. Overview of Modularizing Spark SQL Applications for Dataproc
  • 6. Review Spark SQL Scripts for Dataproc Jobs and Workflows
  • 7. Validate Spark SQL Script for File Format Conversion
  • 8. Exercise to convert file format using Spark SQL Script
  • 9. Validate Spark SQL Script for Daily Product Revenue
  • 10. Develop Spark SQL Script to Cleanup Databases
  • 11. Copy Spark SQL Scripts to GCS
  • 12. Run and Validate Spark SQL Scripts in GCS
  • 13. Limitations of Running Spark SQL Scripts using Dataproc Jobs
  • 14. Manage Dataproc Clusters using gcloud Commands
  • 15. Run Dataproc Jobs using Spark SQL Command or Query
  • 16. Run Dataproc Jobs using Spark SQL Scripts
  • 17. Exercises to Run Spark SQL Scripts as Dataproc Jobs using gcloud
  • 18. Delete Dataproc Jobs using gcloud commands
  • 19. Importance of using gcloud commands to manage dataproc jobs
  • 20. Getting Started with Dataproc Workflow Templates using Web UI
  • 21. Review Steps and Design to create Dataproc Workflow Template
  • 22. Create Dataproc Workflow Template and Add Cluster using gcloud Commands
  • 23. Review gcloud Commands to Add Jobs to Dataproc Workflow Templates
  • 24. Add Jobs to Dataproc Workflow Template using Commands
  • 25. Instantiate Dataproc Workflow Template to run the Data Pipeline
  • 26. Overview of Dataproc Operations and Deleting Workflow Runs
  • 27. Run and Validate ELT Data Pipeline using Dataproc
  • 28. Stop Dataproc Cluster

8. Big Data Processing using Databricks on GCP
  • 1. Signing up for Databricks on GCP
  • 2. Create Databricks Workspace on GCP
  • 3. Getting Started with Databricks Clusters on GCP
  • 4. Getting Started with Databricks Notebook
  • 5. Overview of Databricks on GCP
  • 6. Overview of Databricks CLI Commands
  • 7. Limitations of Managing DBFS using Databricks CLI
  • 8. Overview of Copying Data Sets into DBFS on GCS
  • 9. Create Folder in GCS using DBFS Commands
  • 10. Upload Data Set into DBFS using GCS Web UI
  • 11. Copy Data Set into DBFS using gsutil
  • 12. Process Data in DBFS using Databricks Spark SQL
  • 13. Getting Started with Spark SQL Example using Databricks
  • 14. Create Temporary Views using Spark SQL
  • 15. Exercise to create temporary views using Spark SQL
  • 16. Spark SQL Query to compute Daily Product Revenue
  • 17. Save Query Result to DBFS using Spark SQL
  • 18. Overview of Pyspark Examples on Databricks
  • 19. Process Schema Details in JSON using Pyspark
  • 20. Create Dataframe with Schema from JSON File using Pyspark
  • 21. Transform Data using Spark APIs
  • 22. Get Schema Details for all Data Sets using Pyspark
  • 23. Convert CSV to Parquet with Schema using Pyspark

9. ELT Data Pipelines using Databricks on GCP
  • 1. Overview of Databricks Workflows
  • 2. Pass Arguments to Databricks Python Notebooks
  • 3. Pass Arguments to Databricks SQL Notebooks
  • 4. Create and Run First Databricks Job
  • 5. Run Databricks Jobs and Tasks with Parameters
  • 6. Create and Run Orchestrated Pipeline using Databricks Job
  • 7. Import ELT Data Pipeline Applications into Databricks Environment
  • 8. Spark SQL Application to Cleanup Database and Datasets
  • 9. Review File Format Converter Pyspark Code
  • 10. Review Databricks SQL Notebooks for Tables and Final Results
  • 11. Validate Applications for ELT Pipeline using Databricks
  • 12. Build ELT Pipeline using Databricks Job in Workflows
  • 13. Run and Review Execution details of ELT Data Pipeline using Databricks Job

Price: 139,000 Toman
ID: 1813
Size: 4249 MB
Duration: 659 minutes
Publication date: 27 Dey 1401