Prophecy Data Transformation Copilot for Data Engineering

Course Outline

Learn Databricks and Spark data engineering to deliver self-service data transformation and speed pipeline development


01 - A warm welcome from Prophecy's co-founder
  • 001 Welcome to Prophecy for Data Engineering on Databricks and Spark

02 - The future of data transformation
  • 001 What's the future of data transformation
  • 002 The evolution of data transformation
  • 003 Ideal data transformation solution for the cloud
  • 004 Prophecy and the future of data transformation
  • 005 How to build the ideal data transformation in the cloud

03 - Data lakes, warehouses, and lakehouses - when to use what (Optional)
  • 001 What is a data lake, and how it differs from a data warehouse
  • 002 Introducing the data lakehouse and why it's the perfect solution

04 - Introduction to Spark and Databricks (Optional)
  • 001 Meet your instructor and module overview
  • 002 Apache Spark architecture and concepts
  • 003 Spark language and tooling
  • 004 From Apache Spark to Databricks - why are they different
  • 005 Data lakehouse, Unity Catalog, optimization, and security
  • 006 Working with Spark best practices
  • 007 Spark and Databricks tips and tricks
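
To ground the concepts lessons 002 and 003 introduce, here is a minimal PySpark sketch (not course material) of the driver/executor split and lazy evaluation; the app name and sample rows are made up for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# The driver builds a SparkSession; work is distributed to executors.
spark = SparkSession.builder.appName("spark-intro-sketch").getOrCreate()

df = spark.createDataFrame(
    [("acme", 100), ("globex", 250), ("acme", 50)],  # made-up sample rows
    ["account", "amount"],
)

# Transformations are lazy: this line builds a plan but runs nothing.
totals = df.groupBy("account").agg(F.sum("amount").alias("total_amount"))

# An action triggers the distributed job and materializes the result.
totals.show()
```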

05 - Getting started with Prophecy
  • 001 Prophecy Overview - let's learn together!
  • 002 Setting up a Databricks Fabric to execute our Pipelines
  • 003 Create a Prophecy Project to manage our Spark code
  • 004 Getting started with the Pipeline canvas
  • 005 Explore code view and perform simple aggregations
  • 006 Join accounts and opportunities data and write results to a delta table
  • 007 Create a Pipeline and read from Data Sources to start building our Pipeline
  • 008 Deploying Pipelines to production to run our scheduled Pipelines
  • 009 Introduction to Prophecy Users and Teams
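
Prophecy's visual Pipelines compile down to ordinary Spark code. As a rough sketch of what lesson 006's join might generate (the table names and the account_id join key are assumptions, not taken from the course):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical source tables standing in for the course's Data Sources.
accounts = spark.read.table("raw.accounts")
opportunities = spark.read.table("raw.opportunities")

# Join the two sources on an assumed shared key.
joined = accounts.join(opportunities, on="account_id", how="inner")

# Write the result out as a Delta table (the target Gem's job).
(joined.write.format("delta")
       .mode("overwrite")
       .saveAsTable("analytics.account_opportunities"))
```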

06 - Data Sources and Targets
  • 001 Data Sources and Targets overview
  • 002 Parse and read raw data from object store with best practices
  • 003 Prophecy built-in Data Sources and Data Sets
  • 004 Explore Data Source default options
  • 005 Read and parse source parquet data and merge schema
  • 006 Handle corrupt and malformed records when reading from object stores
  • 007 Additional options to handle corrupt and malformed records
  • 008 Work with source data schema and delimiters
  • 009 Read from delta tables as sources
  • 010 Write data to a delta table using a target Gem
  • 011 Partition data when writing to a delta table for optimal performance
  • 012 What we've learned in this module
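
For lessons 006, 007, and 011, a hedged PySpark sketch of one common approach: PERMISSIVE mode with a _corrupt_record column to survive malformed input, followed by a partitioned Delta write. The schema, object-store path, and table names are all hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.getOrCreate()

# An explicit schema plus a _corrupt_record column lets PERMISSIVE mode
# keep malformed rows for inspection instead of failing the read.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("order_date", StringType()),
    StructField("amount", DoubleType()),
    StructField("_corrupt_record", StringType()),
])

raw = (
    spark.read
    .schema(schema)
    .option("mode", "PERMISSIVE")
    .option("columnNameOfCorruptRecord", "_corrupt_record")
    .json("s3://example-bucket/raw/orders/")  # hypothetical path
)

good = raw.filter("_corrupt_record IS NULL").drop("_corrupt_record")

# Partitioning by a frequently filtered column enables partition pruning.
(good.write.format("delta")
     .mode("append")
     .partitionBy("order_date")
     .saveAsTable("bronze.orders"))
```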

07 - Data Lakehouse Architecture
  • 001 Data lakehouse and the medallion architecture module overview
  • 002 Medallion architecture - bronze, silver, and gold layer characteristics
  • 003 Read and write data by partition - daily load from object storage
  • 004 Additional data load by partition - daily load from object storage
  • 005 Introduction to data models in a data lakehouse
  • 006 Write the bronze layer data to delta tables
  • 007 Introduction to Slowly Changing Dimensions (SCD)
  • 008 Implement simple SCD2 for bronze layer table
  • 009 Bulk load read and write options
  • 010 Bulk load historical data with SCD2
  • 011 Delta table data versioning
  • 012 Work with incompatible schemas
  • 013 Recover data from a previous version
  • 014 A summary of what we've learned in this module
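
Lessons 011 and 013 rely on Delta's built-in versioning. A minimal sketch of the underlying SQL, assuming a hypothetical bronze.customers table and version number:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Inspect the table's commit history (one row per Delta version).
spark.sql("DESCRIBE HISTORY bronze.customers") \
     .select("version", "timestamp", "operation").show()

# Time travel: query the table as it looked at an earlier version.
v3 = spark.sql("SELECT * FROM bronze.customers VERSION AS OF 3")

# Recover from a bad write by restoring the table to that version.
spark.sql("RESTORE TABLE bronze.customers TO VERSION AS OF 3")
```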

08 - Building the Silver and Gold Layers
  • 001 Building the Silver and Gold layers - Overview
  • 002 Data integration and cleaning in the Silver layer
  • 003 Build a data model and integrate data in the Silver layer
  • 004 Implement SCD2 in the silver layer
  • 005 Generating unique IDs and writing data to delta tables
  • 006 Business requirements for the Gold layer
  • 007 Perform analytics in the Gold layer to build business reports
  • 008 Using subgraphs for reusability to simplify Pipelines
  • 009 A summary of what we've learned in this module
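
A sketch of one way to implement lesson 004's SCD2 logic with a Delta MERGE. The customer table, the customer_id key, and the single tracked attribute (address) are assumptions for illustration, not the course's exact model.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

target = DeltaTable.forName(spark, "silver.customers")  # hypothetical target
updates = spark.read.table("bronze.customers_daily")    # hypothetical source
current = target.toDF().filter("is_current = true")

# Keep only brand-new keys or rows whose tracked attribute changed.
changed = (
    updates.alias("u")
    .join(current.alias("t"),
          F.col("u.customer_id") == F.col("t.customer_id"), "left")
    .filter("t.customer_id IS NULL OR t.address <> u.address")
    .select("u.*")
)

# Step 1: close out the superseded current rows.
(target.alias("t")
 .merge(changed.alias("c"),
        "t.customer_id = c.customer_id AND t.is_current = true")
 .whenMatchedUpdate(set={"is_current": "false", "end_date": "current_date()"})
 .execute())

# Step 2: append the changed rows as the new open (current) versions.
(changed
 .withColumn("is_current", F.lit(True))
 .withColumn("start_date", F.current_date())
 .withColumn("end_date", F.lit(None).cast("date"))
 .write.format("delta").mode("append").saveAsTable("silver.customers"))
```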

09 - Deploying Pipelines to production
  • 001 Pipeline deployment overview
  • 002 Ways to orchestrate workflows to automate jobs
  • 003 Configure incremental Pipeline to prepare for scheduled runs
  • 004 Create a Prophecy Job to schedule the Pipelines to run daily
  • 005 What is CI/CD and how to deploy Pipelines to production
  • 006 Advanced use cases - integrating with an external CI/CD process using PBT
  • 007 A summary of what we've learned in this module
  • external-links.txt

10 - Managing versions and change control
  • 001 Version management and change control overview
  • 002 Prophecy Projects and the git process
  • 003 Collaborating on a Pipeline - catching the dev branch up to the main branch
  • 004 Reverting changes when developing a Pipeline before committing
  • 005 Reverting to a prior commit after committing by using rollback
  • 006 Merging changes and switching between branches
  • 007 Resolving code conflicts when multiple team members are making commits
  • 008 Cloning an existing Prophecy Project to a new repository
  • 009 Reusing an existing Prophecy Project by importing the Project
  • 010 Creating pull requests and handling commit conflicts
  • 011 A summary of what we've learned in this module

11 - Reusability and extensibility
  • 001 Reusability and extensibility overview
  • 002 The importance of setting data engineering standards - reuse and extend
  • 003 Convert a script to a customized Gem to share and reuse
  • 004 Create a new Gem for a multi-dimensional cube using the specified expressions
  • 005 Create a UI for the cube Gem for users to define the cube
  • 006 Adding additional features to make the customized Gem UI intuitive
  • 007 Error handling by adding validations and customized error messages
  • 008 Testing customized cube Gem and publishing the Gem to share with others
  • 009 Assigning proper access to share the newly built cube Gem
  • 010 Use the newly created cube Gem by adding it as a dependency
  • 011 A summary of what we've learned in this module
  • external-links.txt
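
Module 11's cube Gem wraps functionality Spark already exposes. A minimal sketch of the kind of code such a Gem might emit, with a hypothetical gold.sales table and grouping columns:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

sales = spark.read.table("gold.sales")  # hypothetical table

# cube() aggregates over every combination of the grouping columns,
# including subtotals and the grand total (NULL marks rolled-up levels).
result = (
    sales.cube("region", "product")
         .agg(F.sum("amount").alias("total_amount"))
)
result.show()
```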

12 - Data testing
  • 001 Data quality and unit testing overview
  • 002 Medallion architecture and data quality
  • 003 Data quality Pipeline walkthrough - how to populate data quality log
  • 004 Silver layer data quality checks, define errors, and write to delta table
  • 005 Data integration quality checks with joins - check if customer IDs are missing
  • 006 Performing data reconciliation checks - identify mismatching column values
  • 007 Identifying and tracking data quality issues by drilling down to a specific ID
  • 008 Executing data quality checks in phases - stop the Pipeline if errors exist
  • 009 Unit testing options - testing expressions using output equality
  • 010 Explore code view of the unit test
  • 011 Running the unit tests
  • 012 Unit testing expressions using output predicates
  • 013 A summary of what we've learned in this module
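
A hedged sketch of the pattern lessons 005 and 008 describe: a left-anti join to find orders whose customer_id has no match upstream, logged to a quality table, with the run failing fast when errors exist. All table and column names are assumptions.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

orders = spark.read.table("silver.orders")        # hypothetical tables
customers = spark.read.table("silver.customers")

# Integration check: order rows whose customer_id is missing upstream.
failures = (
    orders.join(customers, "customer_id", "left_anti")
          .withColumn("check_name", F.lit("missing_customer_id"))
          .withColumn("checked_at", F.current_timestamp())
)

# Record failures in a data quality log table...
failures.write.format("delta").mode("append").saveAsTable("audit.dq_log")

# ...then stop the Pipeline if any errors exist.
if failures.count() > 0:
    raise RuntimeError("Data quality check failed: orders with unknown customer_id")
```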

ID: 42444
Size: 1569 MB
Duration: 292 minutes
Release date: 27 Dey 1403 (Persian calendar)