001 - Introduction
002 - Learning objectives
003 - Design an Azure Data Lake solution
004 - Recommend file types for storage
005 - Recommend file types for analytical queries
006 - Design for efficient querying
007 - Learning objectives
008 - Design a folder structure that represents levels of data transformation
009 - Design a distribution strategy
010 - Design a data archiving solution
011 - Learning objectives
012 - Design a partition strategy for files
013 - Design a partition strategy for analytical workloads
014 - Design a partition strategy for efficiency and performance
015 - Design a partition strategy for Azure Synapse Analytics
016 - Identify when partitioning is needed in Azure Data Lake Storage Gen2
017 - Learning objectives
018 - Design star schemas
019 - Design slowly changing dimensions
020 - Design a dimensional hierarchy
021 - Design a solution for temporal data
022 - Design for incremental loading
023 - Design analytical stores
024 - Design metastores in Azure Synapse Analytics and Azure Databricks
025 - Learning objectives
026 - Implement compression
027 - Implement partitioning
028 - Implement sharding
029 - Implement different table geometries with Azure Synapse Analytics pools
030 - Implement data redundancy
031 - Implement distributions
032 - Implement data archiving
033 - Learning objectives
034 - Build a temporal data solution
035 - Build a slowly changing dimension
036 - Build a logical folder structure
037 - Build external tables
038 - Implement file and folder structures for efficient querying and data pruning
039 - Learning objectives
040 - Deliver data in a relational star schema
041 - Deliver data in Parquet files
042 - Maintain metadata
043 - Implement a dimensional hierarchy
044 - Learning objectives
045 - Transform data by using Apache Spark
046 - Transform data by using Transact-SQL
047 - Transform data by using Data Factory
048 - Transform data by using Azure Synapse pipelines
049 - Transform data by using Stream Analytics
050 - Learning objectives
051 - Cleanse data
052 - Split data
053 - Shred JSON
054 - Encode and decode data
055 - Learning objectives
056 - Configure error handling for the transformation
057 - Normalize and denormalize values
058 - Transform data by using Scala
059 - Perform data exploratory analysis
060 - Learning objectives
061 - Develop batch processing solutions by using Data Factory, Data Lake, Spark, Azure Synapse Pipelines, PolyBase, and Azure Databricks
062 - Create data pipelines
063 - Design and implement incremental data loads
064 - Design and develop slowly changing dimensions
065 - Handle security and compliance requirements
066 - Scale resources
067 - Learning objectives
068 - Configure the batch size
069 - Design and create tests for data pipelines
070 - Integrate Jupyter and Python notebooks into a data pipeline
071 - Handle duplicate data
072 - Handle missing data
073 - Handle late-arriving data
074 - Learning objectives
075 - Upsert data
076 - Regress to a previous state
077 - Design and configure exception handling
078 - Configure batch retention
079 - Revisit batch processing solution design
080 - Debug Spark jobs by using the Spark UI
081 - Learning objectives
082 - Develop a stream processing solution by using Stream Analytics, Azure Databricks, and Azure Event Hubs
083 - Process data by using Spark structured streaming
084 - Monitor for performance and functional regressions
085 - Design and create windowed aggregates
086 - Handle schema drift
087 - Learning objectives
088 - Process time series data
089 - Process across partitions
090 - Process within one partition
091 - Configure checkpoints and watermarking during processing
092 - Scale resources
093 - Design and create tests for data pipelines
094 - Optimize pipelines for analytical or transactional purposes
095 - Learning objectives
096 - Handle interruptions
097 - Design and configure exception handling
098 - Upsert data
099 - Replay archived stream data
100 - Design a stream processing solution
101 - Learning objectives
102 - Trigger batches
103 - Handle failed batch loads
104 - Validate batch loads
105 - Manage data pipelines in Data Factory and Synapse pipelines
106 - Schedule data pipelines in Data Factory and Synapse pipelines
107 - Implement version control for pipeline artifacts
108 - Manage Spark jobs in a pipeline
109 - Learning objectives
110 - Design data encryption for data at rest and in transit
111 - Design a data auditing strategy
112 - Design a data masking strategy
113 - Design for data privacy
114 - Learning objectives
115 - Design a data retention policy
116 - Design to purge data based on business requirements
117 - Design Azure RBAC and POSIX-like ACL for Data Lake Storage Gen2
118 - Design row-level and column-level security
119 - Learning objectives
120 - Implement data masking
121 - Encrypt data at rest and in motion
122 - Implement row-level and column-level security
123 - Implement Azure RBAC
124 - Implement POSIX-like ACLs for Data Lake Storage Gen2
125 - Implement a data retention policy
126 - Implement a data auditing strategy
127 - Learning objectives
128 - Manage identities, keys, and secrets across different data platforms
129 - Implement secure endpoints (private and public)
130 - Implement resource tokens in Azure Databricks
131 - Load a DataFrame with sensitive information
132 - Write encrypted data to tables or Parquet files
133 - Manage sensitive information
134 - Learning objectives
135 - Implement logging used by Azure Monitor
136 - Configure monitoring services
137 - Measure performance of data movement
138 - Monitor and update statistics about data across a system
139 - Monitor data pipeline performance
140 - Measure query performance
141 - Learning objectives
142 - Monitor cluster performance
143 - Understand custom logging options
144 - Schedule and monitor pipeline tests
145 - Interpret Azure Monitor metrics and logs
146 - Interpret a Spark Directed Acyclic Graph (DAG)
147 - Learning objectives
148 - Compact small files
149 - Rewrite user-defined functions (UDFs)
150 - Handle skew in data
151 - Handle data spill
152 - Tune shuffle partitions
153 - Find shuffling in a pipeline
154 - Optimize resource management
155 - Learning objectives
156 - Tune queries by using indexers
157 - Tune queries by using cache
158 - Optimize pipelines for analytical or transactional purposes
159 - Optimize pipeline for descriptive versus analytical workloads
160 - Troubleshoot failed Spark jobs
161 - Troubleshoot failed pipeline runs
162 - Summary