Prophecy Data Transformation Copilot for Data Engineering

Course Outline

Learn Databricks and Spark data engineering to deliver self-service data transformation and speed pipeline development


01 - A warm welcome from Prophecy's co-founder
  • 001 Welcome to Prophecy for Data Engineering on Databricks and Spark

02 - The future of data transformation
  • 001 What's the future of data transformation
  • 002 The evolution of data transformation
  • 003 Ideal data transformation solution for the cloud
  • 004 Prophecy and the future of data transformation
  • 005 How to build the ideal data transformation in the cloud

03 - Data lakes, warehouses, and lakehouses - when to use what (Optional)
  • 001 What is a data lake and the difference between a data lake and a data warehouse
  • 002 Introducing the data lakehouse and why it's the perfect solution

04 - Introduction to Spark and Databricks (Optional)
  • 001 Meet your instructor and module overview
  • 002 Apache Spark architecture and concepts
  • 003 Spark language and tooling
  • 004 From Apache Spark to Databricks - why are they different
  • 005 Data lakehouse, Unity Catalog, optimization and security
  • 006 Working with Spark best practices
  • 007 Spark and Databricks tips and tricks

05 - Getting started with Prophecy
  • 001 Prophecy Overview - let's learn together!
  • 002 Setting up a Databricks Fabric to execute our Pipelines
  • 003 Create a Prophecy Project to manage our Spark code
  • 004 Getting started with the Pipeline canvas
  • 005 Explore code view and perform simple aggregations
  • 006 Join accounts and opportunities data and write results to a delta table
  • 007 Create a Pipeline and read from Data Sources to start building our Pipeline
  • 008 Deploying Pipelines to production to run our scheduled Pipelines
  • 009 Introduction to Prophecy Users and Teams
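
The Pipeline assembled in this module (read sources, join accounts to opportunities, aggregate, write to a delta table) compiles down to ordinary Spark code. A minimal PySpark sketch of that shape; the paths, column names, and output location are hypothetical placeholders, not the course's actual files:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    # Source Gems: read the two raw datasets (placeholder paths)
    accounts = spark.read.option("header", True).csv("/landing/accounts")
    opportunities = spark.read.option("header", True).csv("/landing/opportunities")

    # Join Gem: attach each opportunity to its account
    joined = accounts.join(opportunities, on="account_id", how="inner")

    # Aggregate Gem: total opportunity amount per account
    summary = joined.groupBy("account_id").agg(F.sum("amount").alias("total_amount"))

    # Target Gem: write the result to a delta table (placeholder path)
    summary.write.format("delta").mode("overwrite").save("/tables/account_summary")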

06 - Data Sources and Targets
  • 001 Data Sources and Targets overview
  • 002 Parse and read raw data from object store with best practices
  • 003 Prophecy built-in Data Sources and Data Sets
  • 004 Explore Data Source default options
  • 005 Read and parse source parquet data and merge schema
  • 006 Handle corrupt and malformed records when reading from object stores
  • 007 Additional options to handle corrupt and malformed records
  • 008 Work with source data schema and delimiters
  • 009 Read from delta tables as sources
  • 010 Write data to a delta table using a target Gem
  • 011 Partition data when writing to a delta table for optimal performance
  • 012 What we've learned in this module
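
The corrupt-record lessons here map onto Spark's standard reader modes. A hedged sketch of reading a malformed CSV feed, merging parquet schemas on read, and writing a partitioned delta target; the file paths and columns are made up for illustration:

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    spark = SparkSession.builder.getOrCreate()

    # Explicit schema, including a column that captures malformed rows
    schema = StructType([
        StructField("id", IntegerType()),
        StructField("name", StringType()),
        StructField("country", StringType()),
        StructField("_corrupt_record", StringType()),  # filled in PERMISSIVE mode
    ])

    # mode can be PERMISSIVE (keep bad rows), DROPMALFORMED, or FAILFAST
    customers = (spark.read
                 .option("header", True)
                 .option("mode", "PERMISSIVE")
                 .option("columnNameOfCorruptRecord", "_corrupt_record")
                 .schema(schema)
                 .csv("/landing/customers"))

    # Parquet sources with evolving schemas can be merged on read
    events = spark.read.option("mergeSchema", True).parquet("/landing/events")

    # Partition the delta target on a commonly filtered column
    (customers.write.format("delta")
     .mode("overwrite")
     .partitionBy("country")
     .save("/bronze/customers"))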

07 - Data Lakehouse Architecture
  • 001 Data lakehouse and the medallion architecture module overview
  • 002 Medallion architecture - bronze, silver, and gold layer characteristics
  • 003 Read and write data by partition - daily load from object storage
  • 004 Additional data load by partition - daily load from object storage
  • 005 Introduction to data models in a data lakehouse
  • 006 Write the bronze layer data to delta tables
  • 007 Introduction to Slowly Changing Dimensions (SCD)
  • 008 Implement simple SCD2 for bronze layer table
  • 009 Bulk load read and write options
  • 010 Bulk load historical data with SCD2
  • 011 Delta table data versioning
  • 012 Work with incompatible schemas
  • 013 Recover data from a previous version
  • 014 A summary of what we've learned in this module
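
The versioning and recovery lessons rest on delta's table history API. A small sketch of lessons 011 and 013 (requires the delta-spark package; the table path is a placeholder):

    from pyspark.sql import SparkSession
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()
    table = DeltaTable.forPath(spark, "/bronze/accounts")  # placeholder path

    # Every write produces a new version; inspect the log
    table.history().select("version", "timestamp", "operation").show()

    # Time travel: query the table as it was at version 1
    v1 = spark.read.format("delta").option("versionAsOf", 1).load("/bronze/accounts")

    # Recover from a bad load by restoring an earlier version
    table.restoreToVersion(1)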

08 - Building the Silver and Gold Layers
  • 001 Building the Silver and Gold layers - Overview
  • 002 Data integration and cleaning in the Silver layer
  • 003 Build a data model and integrate data in the Silver layer
  • 004 Implement SCD2 in the silver layer
  • 005 Generating unique IDs and writing data to delta tables
  • 006 Business requirements for the Gold layer
  • 007 Perform analytics in the Gold layer to build business reports
  • 008 Using subgraphs for reusability to simplify Pipelines
  • 009 A summary of what we've learned in this module
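
Lesson 004 implements SCD2 with a delta MERGE. A deliberately simplified sketch, assuming the daily feed contains only new or changed customer records and that address is the tracked attribute; all table paths and column names are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()
    target = DeltaTable.forPath(spark, "/silver/customers")        # placeholder
    updates = spark.read.format("delta").load("/bronze/customers_daily")

    # Step 1: close out the current row for customers whose address changed
    (target.alias("t")
     .merge(updates.alias("u"),
            "t.customer_id = u.customer_id AND t.is_current = true")
     .whenMatchedUpdate(
         condition="t.address <> u.address",
         set={"is_current": "false", "end_date": "current_date()"})
     .execute())

    # Step 2: append the incoming rows as the new current versions,
    # stamping a generated surrogate key (lesson 005)
    (updates
     .withColumn("customer_sk", F.expr("uuid()"))
     .withColumn("is_current", F.lit(True))
     .withColumn("start_date", F.current_date())
     .withColumn("end_date", F.lit(None).cast("date"))
     .write.format("delta").mode("append").save("/silver/customers"))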

09 - Deploying Pipelines to production
  • 001 Pipeline deployment overview
  • 002 Ways to orchestrate workflows to automate jobs
  • 003 Configure incremental Pipeline to prepare for scheduled runs
  • 004 Create a Prophecy Job to schedule the Pipelines to run daily
  • 005 What is CI/CD and how to deploy Pipelines to production
  • 006 Advanced use cases - integrating with an external CI/CD process using PBT
  • 007 A summary of what we've learned in this module
  • external-links.txt
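
A Prophecy Job ultimately materializes as a scheduled Databricks job. A hedged sketch of what the equivalent direct call to the Databricks Jobs 2.1 API looks like; the workspace URL, token, cluster ID, and file path are placeholders, and in practice Prophecy (or PBT inside an external CI/CD process) manages this for you:

    import requests

    HOST = "https://<workspace>.cloud.databricks.com"   # placeholder
    TOKEN = "<personal-access-token>"                   # placeholder

    job_spec = {
        "name": "daily_lakehouse_pipeline",
        "schedule": {
            "quartz_cron_expression": "0 0 2 * * ?",    # 02:00 every day
            "timezone_id": "UTC",
        },
        "tasks": [{
            "task_key": "run_pipeline",
            "spark_python_task": {"python_file": "dbfs:/pipelines/main.py"},
            "existing_cluster_id": "<cluster-id>",      # placeholder
        }],
    }

    resp = requests.post(f"{HOST}/api/2.1/jobs/create",
                         headers={"Authorization": f"Bearer {TOKEN}"},
                         json=job_spec)
    resp.raise_for_status()
    print("Created job:", resp.json()["job_id"])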

10 - Managing versions and change control
  • 001 Version management and change control overview
  • 002 Prophecy Projects and the git process
  • 003 Collaborating on a Pipeline - catching up the dev branch to the main branch
  • 004 Reverting changes when developing a Pipeline before committing
  • 005 Reverting to a prior commit after committing by using rollback
  • 006 Merging changes and switching between branches
  • 007 Resolving code conflicts when multiple team members are making commits
  • 008 Cloning an existing Prophecy Project to a new repository
  • 009 Reusing an existing Prophecy Project by importing the Project
  • 010 Creating pull requests and handling commit conflicts
  • 011 A summary of what we've learned in this module
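
Prophecy drives all of this from its UI, but underneath it is a standard git workflow. A rough sketch of the equivalent steps, wrapped in Python only to keep one language across the examples in this outline; branch and commit names are made up:

    import subprocess

    def git(*args):
        # Run one git command and fail loudly, mirroring one UI action per call
        subprocess.run(["git", *args], check=True)

    git("checkout", "-b", "dev")                 # develop on a branch
    git("commit", "-am", "Add silver-layer Pipeline")
    git("fetch", "origin")
    git("merge", "origin/main")                  # catch the branch up to main
    git("checkout", "main")
    git("merge", "dev")                          # merge the finished work
    git("push", "origin", "main")

    git("checkout", "--", ".")                   # revert uncommitted edits
    git("revert", "HEAD")                        # roll back the last commit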

11 - Reusability and extensibility
  • 001 Reusability and extensibility overview
  • 002 The importance of setting data engineering standards - reuse and extend
  • 003 Convert a script to a customized Gem to share and reuse
  • 004 Create a new Gem for a multi-dimensional cube using the specified expressions
  • 005 Create a UI for the cube Gem for users to define the cube
  • 006 Adding additional features to make the customized Gem UI intuitive
  • 007 Error handling by adding validations and customized error messages
  • 008 Testing customized cube Gem and publishing the Gem to share with others
  • 009 Assigning proper access to share the newly built cube Gem
  • 010 Use the newly created cube Gem by adding it as a dependency
  • 011 A summary of what we've learned in this module
  • external-links.txt
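
Under the hood, a multi-dimensional cube Gem can compile to Spark's built-in cube operator, which aggregates over every combination of the grouping columns, subtotals and grand total included. A small sketch with hypothetical data (the Gem-authoring API itself is what the videos cover):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    sales = spark.read.format("delta").load("/gold/sales")   # placeholder

    cube = (sales
            .cube("region", "product")                       # all combinations
            .agg(F.sum("amount").alias("total_amount"),
                 F.count("*").alias("order_count")))
    cube.show()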

12 - Data testing
  • 001 Data quality and unit testing overview
  • 002 Medallion architecture and data quality
  • 003 Data quality Pipeline walkthrough - how to populate the data quality log
  • 004 Silver layer data quality checks, define errors, and write to delta table
  • 005 Data integration quality checks with joins - check if customer IDs are missing
  • 006 Performing data reconciliation checks - identify mismatching column values
  • 007 Identifying and tracking data quality issues by drilling down to a specific ID
  • 008 Executing data quality checks in phases - stop the Pipeline if errors exist
  • 009 Unit testing options - testing expressions using output equality
  • 010 Explore code view of the unit test
  • 011 Running the unit tests
  • 012 Unit testing expressions using output predicates
  • 013 A summary of what we've learned in this module
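
The output-equality and output-predicate styles of unit test in this module translate naturally to pytest. A minimal sketch with a made-up expression under test:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    def dedupe_customers(df):
        # Hypothetical expression under test: keep one row per customer_id
        return df.dropDuplicates(["customer_id"])

    def test_dedupe_customers():
        spark = SparkSession.builder.master("local[1]").getOrCreate()
        input_df = spark.createDataFrame(
            [(1, "a"), (1, "a"), (2, "b")], ["customer_id", "name"])
        expected = spark.createDataFrame(
            [(1, "a"), (2, "b")], ["customer_id", "name"])

        actual = dedupe_customers(input_df)

        # Output equality: same rows, order ignored
        assert sorted(actual.collect()) == sorted(expected.collect())
        # Output predicate: no duplicate IDs survive
        dupes = actual.groupBy("customer_id").count().filter(F.col("count") > 1)
        assert dupes.count() == 0
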
Price: 139,000 Toman
Producer:
ID: 42444
Size: 1569 MB
Duration: 292 minutes
Release date: 27 Dey 1403 (Solar Hijri)